WEB Click Stream Analysis
Content
• Objective
• Description of the data
• Exploratory data analysis
• Model building
– Sequence rules
– Link analysis
– Probabilistic expert systems
– Markov chains
• Model comparison
• Summary report
Introduction
• Visitor behaviour on a website can be
predicted by analysing existing data on the
order in which the site’s webpages are visited
• Click flow is captured
• Every click of the mouse corresponds to the
viewing of a webpage.
• A clickstream is defined as the sequence of webpages
requested
Objective
• To understand the most likely navigation paths in a website, with
the aim of predicting, possibly online, which pages a visitor will
view, given the path they have taken so far.
• This is very useful in finding the probability that a visitor will view
a certain page, perhaps a buying page in an e-commerce site.
• It can also find the probability of entering (or exiting) the website
from any particular page.
• Note that since most pages are now dynamically generated, the
idea of viewing a particular page may need to be replaced with the
idea of viewing a particular class of page, or type of page; a class
could be defined by meta information in the header
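The probabilities named in the objective can be sketched with a first-order Markov chain, one of the models listed in the contents. This is a minimal illustration, not the method used on the slides: it assumes the next page depends only on the current page (a simplification of "the path taken so far"), and the page names, the `<start>`/`<end>` markers and the function name are hypothetical.

```python
from collections import Counter, defaultdict

# Artificial states marking the entry into and the exit from the site (assumed names).
START, END = "<start>", "<end>"

def transition_probabilities(sessions):
    """Estimate P(next page | current page) from a list of page sequences."""
    counts = defaultdict(Counter)
    for pages in sessions:
        path = [START, *pages, END]
        # Count every consecutive pair (current page, next page).
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    # Normalise each row of counts into relative frequencies.
    return {
        current: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for current, nexts in counts.items()
    }

# Toy sessions with made-up page names, for illustration only.
sessions = [
    ["home", "catalog", "buy"],
    ["home", "catalog"],
    ["catalog", "buy"],
    ["home", "login"],
]

probs = transition_probabilities(sessions)
entry = probs[START]                                   # entry probability per page
p_buy_after_catalog = probs["catalog"].get("buy", 0.0)  # P(buy | catalog)
p_exit_from_catalog = probs["catalog"].get(END, 0.0)    # P(exit | catalog)
```

Here `entry` gives the probability of entering the site at each page, and the row for `END` plays the same role for exits. A higher-order chain would condition on more than one previous page.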
Description of the data
• The data set is stored in a log file
• The log file covers a period of about two years, from 30
September 1997 to 30 June 1999.
• The data set contains the user id (c value), the
date and time at which the visitor requested a
specific page (c time), and the webpage seen
(c caller)
• The data set contains 250 711 observations, each
corresponding to a click, that describe the
navigation paths of 22 527 visitors among the 36
pages which compose the site of the webshop.
• The visitors are taken as unique; that is, no
visitor appears with more than one session. But
a page can occur more than once in the same
session. This data set is an example of a
transaction dataset.
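A minimal sketch of turning such a log into one ordered page sequence per visitor. The rows, page names and timestamp format are made up for illustration, and the underscore spellings `c_value`, `c_time`, `c_caller` are an assumption about how the three variables are written in the actual file.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log rows: (c_value, c_time, c_caller).
clicks = [
    ("u1", "1997-09-30 10:00:05", "home"),
    ("u2", "1997-09-30 10:01:00", "home"),
    ("u1", "1997-09-30 10:00:40", "catalog"),
    ("u1", "1997-09-30 10:02:10", "home"),
    ("u2", "1997-09-30 10:03:30", "login"),
]

def build_sessions(rows, time_format="%Y-%m-%d %H:%M:%S"):
    """Group clicks by visitor and order each visitor's pages by click time."""
    by_visitor = defaultdict(list)
    for c_value, c_time, c_caller in rows:
        by_visitor[c_value].append((datetime.strptime(c_time, time_format), c_caller))
    # One session per visitor, as the visitors are taken as unique.
    return {
        visitor: [page for _, page in sorted(hits, key=lambda hit: hit[0])]
        for visitor, hits in by_visitor.items()
    }

sessions = build_sessions(clicks)
```

Note that the page `home` appears twice in the session of visitor `u1`, which matches the remark that a page can occur more than once in the same session.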
Measures
• Box plot of start
• Further reduction in the number of clusters leads to a
noticeable decrease in R2 and an increase in SPRSQ.
This can be seen in Figure 8.3, which plots R2 and
SPRSQ versus the number of groups in the hierarchical
agglomerative algorithm.
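The slide does not define the two measures. Assuming the usual definitions from hierarchical clustering, with T the total sum of squares of all observations, W_k the within-cluster sum of squares of cluster k, g the current number of clusters, and h the cluster obtained by merging clusters r and s, they can be written as:

```latex
R^2 = 1 - \frac{W}{T}, \qquad
W = \sum_{k=1}^{g} W_k, \qquad
\mathrm{SPRSQ} = \frac{W_h - W_r - W_s}{T}
```

Under these definitions SPRSQ is the drop in R2 caused by a single merge, so a merge that joins two dissimilar clusters shows up as a small R2 and a large SPRSQ at the same step.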
Model building
• Sequence rules
• Link analysis
• Probabilistic expert systems
• Markov chains
Sequence rule
Link analysis
• Take the results from the
sequence rules and use link
analysis to build up a global
model.
• Consider all indirect sequences of
any order up to a maximum of 10.
• Link analysis considers each of
the obtained sequences as a row
observation in a data set called
link. It then counts how many of
the observations include a certain
sequence. This is called the count
of a sequence and is the
fundamental measure for link
analysis.
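The count of a sequence can be sketched as follows. This is an illustration under stated assumptions rather than the procedure behind the slides: it handles only indirect sequences of order 2 (page A viewed at some point before page B), whereas the slides allow orders up to 10, and it treats each visitor session as one observation. The sessions and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

def indirect_pair_counts(sessions):
    """Count, for each ordered pair (a, b), the sessions in which a precedes b."""
    counts = Counter()
    for pages in sessions:
        # combinations() keeps input order, so (a, b) means a was viewed before b.
        # The set ensures each pair is counted at most once per session.
        pairs = {(a, b) for a, b in combinations(pages, 2) if a != b}
        counts.update(pairs)
    return counts

# Toy sessions with made-up page names.
sessions = [
    ["home", "catalog", "buy"],
    ["home", "catalog"],
    ["catalog", "home", "buy"],
]

counts = indirect_pair_counts(sessions)
# Share of sessions that contain each sequence.
support = {pair: n / len(sessions) for pair, n in counts.items()}
```

Each counted pair can then be read as a weighted link between two pages, which is what lets the individual sequences be combined into one global picture of the site.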