Personal Web Usage Mining
Mining client side Web Usage Data
Web Usage Mining
The discovery of patterns in the browsing and navigation
data of Web users.
Web usage mining has been an important technology for
understanding users' behavior on the Web.
Currently, most Web usage mining research focuses
on the Web server side.
The main purpose of research is to improve a Web site’s
service and the server’s performance.
Data sources for Web usage mining are primarily Web
server logs. Although it is very important and interesting
to investigate server side issues, we argue that an
equally important and potentially fruitful aspect of Web
usage mining is the mining of client side usage data.
Web Usage Mining on Server Side
Currently, Web usage mining finds patterns in Web server
logs. The logs are preprocessed to group requests from
the same user into sessions.
A session contains the requests from a single visit of a
user to the Web site. During the preprocessing, irrelevant
information for Web usage mining such as background
images and unsuccessful requests is ignored. The users
are identified by the IP addresses in the log and all
requests from the same IP address within a certain time
window are put into a session.
Different heuristics have been developed to deal with the
inaccuracy due to caching, IP sharing or blocking, and
network congestion.
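The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular tool: the record layout `(ip, timestamp, url, status)`, the function name `sessionize`, the ignored file suffixes, and the 30-minute gap used as the "time window" are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, simplified log records: (ip, timestamp, url, status).
def sessionize(records, timeout=timedelta(minutes=30)):
    """Group requests by IP; start a new session after a gap longer than `timeout`."""
    ignored_suffixes = (".gif", ".jpg", ".png", ".css")  # assumed "irrelevant" resources
    by_ip = defaultdict(list)
    for ip, ts, url, status in records:
        # Skip unsuccessful requests and embedded resources such as images.
        if status >= 400 or url.lower().endswith(ignored_suffixes):
            continue
        by_ip[ip].append((ts, url))

    sessions = []
    for ip, hits in by_ip.items():
        hits.sort()  # order each user's requests by time
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((ip, [u for _, u in current]))
                current = []
            current.append(hit)
        sessions.append((ip, [u for _, u in current]))
    return sessions
```

Identifying users by IP address alone is exactly where the inaccuracies mentioned above come from: cached pages never reach the log, and several users behind one shared address are merged into one.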
Web Usage Mining on Server Side
Some common characteristics
• Their goal is to improve Web services and performance
through the improvement of Web sites, including their
contents, structure, presentation, and delivery.
• They focus on the mining of server side data. Their data
sources are almost exclusively server logs, sometimes with
site structure and/or page contents.
• They target groups of users instead of individual users. It is
overwhelming for a Web site to deal with users on an
individual basis.
Personal Web Usage Mining
Personal Web usage mining studies an individual’s Web usage, rather than group behavior.
By looking into a user’s Web usage data, we hope to
understand the user’s interests, behaviors, and
preferences.
In other words, we are building the user’s Web profile.
We call this personal Web usage mining since it focuses
on personal Web usage.
Personal Web Usage Mining
Some of the reasons we advocate personal Web usage
mining are as follows.
• The goal of personal Web usage mining is to help and
enhance individual users’ Web use. It intends to make the
Web easier to use from a single user’s point of view.
• Client side data provide a more accurate and complete
picture of a user’s Web activities.
• We can achieve true individualism and personalization.
• Users have full control of what, when, and how their
data can be used for mining.
• Personal Web usage has increased significantly
recently.
Personal Web Usage Mining
Some researchers are building intelligent agents or
Internet agents that will help individuals use the Web. For
example, many agents were built for information filtering
and gathering on the Web.
WARREN is a multi-agent system for compiling financial
information.
WEBMATE edits a personal newspaper.
WebSifter is a meta-search agent which uses taxonomy
to improve search on the Web.
Other examples include a home page finder, a user
interface learning agent, and a Web browsing assistant.
Although some aspects and pieces of personal Web
usage mining may be around in various areas such as
intelligent agent, Web warehousing, and Web usage
mining,
Personal Web Usage Mining
Two kinds of user Web activities are recorded for
analysis:
● The remote activities
include requests sent by a user to a Web server. This
kind of clickstream data includes the URLs of pages as
well as any keywords, queries, forms, and cookies sent
with the URL.
The remote activities can be captured by almost all Web
browsers. In addition, browsers cache the Web pages
in most cases.
Personal Web Usage Mining
Two kinds of user Web activities are recorded for
analysis:
● The local activities
include actions the user can take at his or her desktop
without the knowledge of Web servers. They include,
but are not limited to, the following.
Save a page, Print a page, Click Back on browser, Click Forward on
browser, Click Reload on browser, Click Stop on browser, Email a
link/page, Add a bookmark, Minimize/maximize/close window, Change
visual settings such as font size.
The local activities can be recorded by an activity recorder, which is a
client side program running on top of the browser.
Personal Web Usage Mining
These two kinds of activities are put together into an
activity log. Each entry in the activity log will contain a
timestamp and an activity. Some will contain extra
information such as URL, cache address, keyword,
cookie, email address, and font size.
The schema of the log looks like this:
(timestamp, activity, [URL], [cache address], [keyword],
[cookie], [email address], [other optional fields])
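One way to picture this schema is as a record with two required fields and the rest optional. The sketch below is only an illustration: the class name `ActivityEntry`, the helper `to_row`, the field types, and the `extra` dictionary standing in for "other optional fields" are assumptions, not part of the framework itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class ActivityEntry:
    # Required fields: every log entry has these two.
    timestamp: datetime
    activity: str
    # Optional fields: present only for activities that need them.
    url: Optional[str] = None
    cache_address: Optional[str] = None
    keyword: Optional[str] = None
    cookie: Optional[str] = None
    email_address: Optional[str] = None
    # Stand-in for "other optional fields", e.g. {"font_size": "large"}.
    extra: dict = field(default_factory=dict)

def to_row(entry):
    """Flatten an entry into a tuple in the same field order as the schema."""
    return (
        entry.timestamp.isoformat(),
        entry.activity,
        entry.url,
        entry.cache_address,
        entry.keyword,
        entry.cookie,
        entry.email_address,
        entry.extra,
    )
```

A remote activity such as a page request would fill in `url` and possibly `cookie`, while a local activity such as changing the font size might carry only the timestamp, the activity name, and an entry in `extra`.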
Personal Web Usage Mining
There are four major modules in the framework:
● Logging
● Data Warehousing
● Data mining
● Tool/Application.
Personal Web Usage Mining
In the logging module, user Web activities are stored
in the activity log, along with the cached pages.
In the data warehousing module, the logs and cached
pages are cleansed, extracted, transformed, aggregated,
and stored in a data warehouse. The data warehouse will
facilitate search, query, and OLAP operations, while also
providing data sources for mining.
In the data mining module, various data mining
algorithms are applied to the data in the data warehouse,
whose findings will be used by the tools and applications
in the tool/application module.
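The flow from log to warehouse to mining can be sketched in a few lines. This is a toy illustration under stated assumptions: log entries are reduced to `(timestamp, activity, url)` tuples, the "warehouse" is just an in-memory counter, and the function names `warehouse` and `mine_interests` are hypothetical. The ranking step stands in for a real mining algorithm, which the slides do not specify.

```python
from collections import Counter
from urllib.parse import urlparse

def warehouse(log):
    """Cleanse and aggregate: count activities per (host, activity) pair."""
    facts = Counter()
    for timestamp, activity, url in log:
        if not url:
            continue  # simplification: this sketch drops entries without a URL
        host = urlparse(url).netloc
        facts[(host, activity)] += 1
    return facts

def mine_interests(facts, top_n=3):
    """Stand-in mining step: rank hosts by total activity count."""
    per_host = Counter()
    for (host, _activity), count in facts.items():
        per_host[host] += count
    return [host for host, _ in per_host.most_common(top_n)]
```

A tool in the tool/application module could then use the ranked hosts, for example to suggest bookmarks. A real warehouse would keep local activities that have no URL rather than discard them.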