UNDER THE GUIDANCE OF Ms. Reshma R. Owhal
Dr. S.F. Sayyad ME (Computer)
Roll No: 17MCO004
 Introduction
 Data Collection and Pre-Processing
 Data Modeling for Web Usage Mining
 Discovery and Analysis of Web Usage Patterns
 Conclusions
 References
 Web usage mining
– can be broadly defined as the discovery and analysis of
useful information from the WWW.
– automatic discovery of patterns in clickstreams and
associated data, collected or generated as a result of user
interactions with one or more Web sites.
 Goal: analyze the behavioral patterns and profiles of
users interacting with a Web site.
 Preparing a suitable data set is important in Web usage mining due to the
characteristics of clickstream data.
 This process is critical to the successful extraction of useful
patterns from the data.
 The process may involve pre-processing the original data, integrating data
from multiple sources, and transforming it; collectively, this is known as data preparation.
 Data cleaning
– remove irrelevant references and fields in server
logs
– remove references due to spider/robot navigation
– add missing references due to caching (done after
sessionization)
 Data fusion/integration
– synchronize data from multiple server logs
– integrate e-commerce and application server data
– integrate meta-data (e.g., content labels)
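The cleaning step above can be sketched roughly as follows. This is a minimal illustration, assuming log entries are already parsed into dictionaries with hypothetical `url` and `agent` keys; the suffix list and robot markers are assumptions chosen for the example, not values from the slides.

```python
# Hypothetical suffixes of embedded objects that are irrelevant to the analysis.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")
# Hypothetical substrings that mark a user agent as a spider/robot.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def clean_log(records):
    """Drop requests for embedded objects and requests made by robots."""
    cleaned = []
    for rec in records:
        url = rec["url"].lower()
        agent = rec.get("agent", "").lower()
        if url.endswith(IRRELEVANT_SUFFIXES):
            continue  # image/style/script reference, not a page request
        if url == "/robots.txt" or any(m in agent for m in ROBOT_MARKERS):
            continue  # spider/robot navigation
        cleaned.append(rec)
    return cleaned


records = [
    {"url": "/index.html", "agent": "Mozilla/5.0"},
    {"url": "/logo.gif", "agent": "Mozilla/5.0"},
    {"url": "/products.html", "agent": "ExampleBot/1.0"},
    {"url": "/robots.txt", "agent": "ExampleBot/1.0"},
]
kept_urls = [r["url"] for r in clean_log(records)]
print(kept_urls)
```

Real robot detection usually needs more than a substring check on the user agent, so the marker list should be treated as a placeholder.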
Data transformation
– user identification
– sessionization
– pageview identification
• a pageview is the set of page files and associated
objects that contribute to a single display in a Web browser
Data Reduction
– sampling and dimensionality reduction (ignoring certain
pageviews / items)
 Identifying User Transactions
– i.e., sets or sequences of pageviews possibly with
associated weights
Sessionization (identifying sessions)
- The process of segmenting the activity record of
each user into sessions, each representing a single visit to the site.
- The goal of a sessionization heuristic is to reconstruct, from
the clickstream data, the actual sequence of actions performed by
one user during one visit to the site.
Reliable usage data is difficult to obtain due to
– proxy servers
– dynamic IP addresses
– the inability of servers to distinguish among different visits.
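One common sessionization heuristic is time-oriented: a new session starts whenever the gap between two consecutive requests from the same user exceeds a threshold. The sketch below assumes requests arrive as `(user_id, timestamp_seconds, url)` tuples and uses a 30-minute timeout; both the record layout and the threshold are assumptions for illustration, not details from the slides.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed threshold


def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Split each user's requests into sessions on gaps longer than `timeout`.

    `requests` is an iterable of (user_id, timestamp_seconds, url) tuples.
    Returns {user_id: [[url, ...], ...]}.
    """
    by_user = defaultdict(list)
    for user, ts, url in requests:
        by_user[user].append((ts, url))

    sessions = {}
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by time
        user_sessions = []
        last_ts = None
        for ts, url in hits:
            if last_ts is None or ts - last_ts > timeout:
                user_sessions.append([])  # gap too long: start a new visit
            user_sessions[-1].append(url)
            last_ts = ts
        sessions[user] = user_sessions
    return sessions


log = [
    ("u1", 0, "/home"),
    ("u1", 120, "/products"),
    ("u1", 4000, "/home"),
    ("u2", 50, "/home"),
]
result = sessionize(log)
print(result)
```

Here the third request from `u1` arrives more than 30 minutes after the second, so it opens a second session.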
Pageview identification
– Depends on the intra-page structure of the site
– Identifies the collection of Web files representing a specific “user
event” corresponding to a clickthrough (e.g., viewing a product page or adding a
product to a shopping cart)
– Another example of such an event is the purchase of a product on an e-commerce site
User identification
– The analysis of Web usage does not require knowledge of a
user’s identity. However, it is necessary to distinguish among different users.
– Since a user may visit a site more than once, the server logs record
multiple sessions for each user.
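When no login or cookie is available, one simple way to distinguish users without knowing who they are is to treat each distinct combination of IP address and user agent as one user. The sketch below illustrates that heuristic; the `ip` and `agent` field names and the `userN` labels are hypothetical.

```python
def identify_users(records):
    """Label each record with an anonymous user id based on (ip, agent)."""
    user_ids = {}
    labeled = []
    for rec in records:
        key = (rec["ip"], rec["agent"])
        if key not in user_ids:
            user_ids[key] = f"user{len(user_ids) + 1}"
        labeled.append({**rec, "user": user_ids[key]})
    return labeled


records = [
    {"ip": "10.0.0.1", "agent": "Firefox", "url": "/home"},
    {"ip": "10.0.0.1", "agent": "Chrome", "url": "/home"},
    {"ip": "10.0.0.1", "agent": "Firefox", "url": "/products"},
]
users = [r["user"] for r in identify_users(records)]
print(users)
```

The heuristic is imperfect: several people behind one proxy with the same browser collapse into one user, and one person with a dynamic IP address splits into several.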
Path completion
- Client- or proxy-side caching often results in missing
access references to pages or objects that have been cached.
- For instance,
– if a user returns to a page A during the same session, the
second access to A will likely display the previously
downloaded version of A cached on the client side, and
therefore no request is made to the server.
– As a result, the second reference to A is not
recorded in the server logs.
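A rough sketch of path completion using referrer information: when a request's referrer is not the page the user was last seen on, the user is assumed to have stepped back through cached pages until reaching the referrer. The `(url, referrer)` record layout is an assumption, and the linear walk back through the recorded history is a simplification of how a real browser history behaves.

```python
def complete_path(session):
    """Insert back-navigation steps that caching hid from the server log.

    `session` is a list of (url, referrer) pairs in request order.
    Returns the completed list of page URLs.
    """
    path = []
    for url, referrer in session:
        if (path and referrer is not None
                and referrer != path[-1] and referrer in path):
            # Walk back through earlier pages until the referrer is reached;
            # these views were served from cache and never logged.
            back = []
            i = len(path) - 2
            while i >= 0:
                back.append(path[i])
                if path[i] == referrer:
                    break
                i -= 1
            path.extend(back)
        path.append(url)
    return path


# The user visits A, B, C, goes back to A via the cache, then follows a link to D.
session = [("A", None), ("B", "A"), ("C", "B"), ("D", "A")]
completed = complete_path(session)
print(completed)
```

In this example the server never saw the return visits to B and A, but the referrer of D reveals them.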
 The discovered patterns: usually represented as
– collections of pages, objects, or resources that are
frequently accessed by groups of users with
common interests.
 Decision Trees
◦ A flow chart of questions leading to a decision
◦ Ex: car-buying decision tree
 Path Analysis
◦ Uses a graph model
◦ Provides insight into navigational problems
◦ Examples of information discovered by path analysis:
 78% followed the path “company” -> “what’s new” -> “sample” -> “order”
 60% left the site after 4 or fewer page references
=> the most important information should be within the first 4 pages of site entry
points.
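Statistics like the two above can be computed directly from sessionized data. The sketch below assumes each session is a list of page names; the sample sessions are made up for illustration and do not reproduce the percentages on the slide.

```python
from collections import Counter


def path_stats(sessions, max_len=4):
    """Return the most frequent full path with its count, and the share
    of sessions that ended after at most `max_len` page references."""
    paths = Counter(tuple(s) for s in sessions)
    short = sum(1 for s in sessions if len(s) <= max_len)
    return paths.most_common(1)[0], short / len(sessions)


sessions = [
    ["company", "what's new", "sample", "order"],
    ["company", "what's new", "sample", "order"],
    ["company", "products"],
    ["company", "what's new", "sample", "order", "checkout"],
    ["company", "what's new", "sample", "order"],
]
top, short_share = path_stats(sessions)
print(top, short_share)
```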
 Grouping
◦ Groups similar information to help draw higher-level conclusions
◦ Ex: all URLs containing the word “Yahoo”…
 Filtering
◦ Answers specific questions such as:
 how many visitors came to the site this week?
 Cookies
◦ A randomly assigned ID issued by the Web server to the browser
◦ Cookies are beneficial to both Web site developers and visitors
◦ The cookie field in a log file entry can be used by Web traffic analysis
software to track repeat visitors => loyal customers.
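Grouping and filtering on cookie-tagged log entries might look like the sketch below. The record fields (`cookie`, `day`, `url`) and the sample data are hypothetical; weeks are taken as ISO calendar weeks, which is one possible reading of "this week".

```python
from datetime import date


def visitors_in_week(records, year, week):
    """Filtering: count distinct cookie ids seen in the given ISO week."""
    ids = set()
    for rec in records:
        iso = rec["day"].isocalendar()
        if (iso[0], iso[1]) == (year, week):
            ids.add(rec["cookie"])
    return len(ids)


def group_by_keyword(records, keyword):
    """Grouping: collect all URLs that contain `keyword`, ignoring case."""
    return [r["url"] for r in records if keyword.lower() in r["url"].lower()]


records = [
    {"cookie": "c1", "day": date(2024, 1, 1), "url": "/search?engine=Yahoo"},
    {"cookie": "c2", "day": date(2024, 1, 3), "url": "/home"},
    {"cookie": "c1", "day": date(2024, 1, 3), "url": "/home"},
    {"cookie": "c3", "day": date(2024, 1, 8), "url": "/yahoo/news"},
]
week_visitors = visitors_in_week(records, 2024, 1)
yahoo_urls = group_by_keyword(records, "yahoo")
print(week_visitors, yahoo_urls)
```

Because cookie `c1` appears twice in the same week, it is counted once, which is how a repeat visitor is recognized.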
 Association Rules
◦ Help find spending patterns on related products
◦ Ex: 30% of those who accessed /company/products/bread.html also accessed
/company/products/milk.htm.
 Sequential Patterns
◦ Help find inter-transaction patterns
◦ Ex: 50% of those who bought items in /pcworld/computers/ also bought in
/pcworld/accessories/ within 15 days
 Clustering
◦ Identifies visitors with common characteristics based on visitors’ profiles
◦ One straightforward approach to creating an aggregate view of each
cluster is to compute the centroid of each cluster.
◦ Ex: 50% of those who applied for the Discover platinum card in
/discovercard/customerService/newcard were in the 25-35 age group,
with annual income between $40,000 and $50,000.
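Two of the quantities mentioned above are simple to compute once sessions are available. The sketch below shows the confidence of an association rule ("of the sessions containing page X, what fraction also contain page Y") and the centroid of a cluster of pageview-weight vectors. The session data and vectors are invented for illustration, and the function names are hypothetical.

```python
def confidence(sessions, antecedent, consequent):
    """Fraction of sessions containing `antecedent` that also contain `consequent`."""
    with_a = [s for s in sessions if antecedent in s]
    if not with_a:
        return 0.0
    return sum(1 for s in with_a if consequent in s) / len(with_a)


def centroid(vectors):
    """Mean vector of a cluster of equal-length pageview-weight vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


BREAD = "/company/products/bread.html"
MILK = "/company/products/milk.htm"
sessions = [
    {BREAD, MILK},
    {BREAD},
    {BREAD, MILK, "/company/index.html"},
    {MILK},
    {BREAD, "/company/index.html"},
]
rule_conf = confidence(sessions, BREAD, MILK)

# Each row is one session in a cluster; each column is the weight of one pageview.
cluster = [[1, 0, 1], [1, 1, 0]]
center = centroid(cluster)
print(rule_conf, center)
```

A centroid entry close to 1 means most sessions in the cluster visited that pageview, which is what makes the centroid usable as an aggregate profile.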
 Web mining supports ongoing, continuous improvement for
e-businesses
 Using Web usage data and data mining to find patterns is a growing area, driven by the
growth of Web-based applications
 Web usage data can be used to better understand Web
usage and to apply this knowledge to better serve users
 Web usage patterns and data mining can be the basis for a great deal
of future research
 Mobasher, B. “Web Usage Mining.” Chapter in Liu, B., “Web Data Mining: Exploring
Hyperlinks, Contents, and Usage Data”, Springer.
 Datla, R., Liu, J. “Web Usage Mining: What, Why, How” (presentation).
 Srivastava, J., Cooley, R., Deshpande, M., Tan, P.N. “Web Usage Mining: Discovery and
Applications of Usage Patterns from Web Data.” SIGKDD Explorations, Vol. 1, Issue 2, 2000.
 Jiang, Q. “Web Usage Mining: Processes and Applications.” CSE
8331, November 24, 2003.
Thank you…..
