APPLYING WEB MINING
APPLICATION FOR USER
BEHAVIOR UNDERSTANDING

Dr. Zakaria Suliman Zubi
Associate Professor
Computer Science Department
Faculty Of Science
Sirte University, Libya


Abstract

Web usage mining (WUM) focuses on discovering potential knowledge from users'
browsing patterns, which lets us find correlations between pages in the
analysis stage.
The primary data source used in web usage mining is the server log file (web log).
Browsing web pages leaves a lot of information in the log file. Analyzing this information helps us understand the behavior of the user.
The web log is essential for web mining to extract usage patterns and study the
visiting characteristics of users.
Our paper focuses on the use of web mining techniques to classify web page types according
to user visits.
This classification helps us understand user behavior.
We also use some classification and association rule techniques to discover
potential knowledge from the browsing patterns.

INTRODUCTION

The Internet offers a huge, global information center for
news, advertising, consumer information, financial management,
education, government, and e-commerce.
The aim of using web mining techniques for understanding user
behavior is to profile user characteristics.
Web mining can be organized into three main categories: web
content mining, web structure mining, and web usage mining.

INTRODUCTION Cont..
Web Mining (diagram): Web Structure Mining, Web Content Mining, Web Usage Mining

1. Web content mining analyzes web content such as text,
multimedia data, and structured data (within web pages or linked
across web pages).
2. Web structure mining is the process of using graph and
network mining theory and methods to analyze the nodes and
connection structures on the Web.
3. Web usage mining is a special type of web mining that
discovers the knowledge hidden in browsing patterns and
analyzes the visiting characteristics of users.

INTRODUCTION Cont..
The Primary Data of Web Usage Mining
1. Web server logs.
2. Data about visitors to the sites.
3. Registration forms.

Fig 2: Portion of a typical server log
A standard log file has the following format:
remotehost; logname; username; date; request; status; bytes
where:
remotehost: the remote hostname or its IP address;
logname: the remote log name of the user;
username: the username with which the user has authenticated;
date: the date and time of the request;
request: the exact request line as it came from the client;
status: the HTTP status code returned to the client; and
bytes: the content length of the document transferred.
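
As a rough illustration of the fields listed above, the following sketch parses one log entry. It assumes the log follows the NCSA Common Log Format, in which the date is bracketed and the request line is quoted; the slides do not confirm the exact layout. The sample line, the `LogEntry` type, and the `parse_line` helper are hypothetical and are not taken from the presented application.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    remotehost: str
    logname: str
    username: str
    date: datetime
    request: str
    status: int
    bytes_sent: int  # the "bytes" field of the log entry


# Assumed layout: host logname user [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$'
)


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None when the line does not match the assumed layout."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    host, logname, user, date, request, status, size = match.groups()
    return LogEntry(
        remotehost=host,
        logname=logname,
        username=user,
        date=datetime.strptime(date, "%d/%b/%Y:%H:%M:%S %z"),
        request=request,
        status=int(status),
        bytes_sent=0 if size == "-" else int(size),  # "-" means no body was sent
    )


# Hypothetical sample entry, made up for this sketch.
sample = (
    '192.168.1.10 - alice [10/Oct/2013:13:55:36 +0200] '
    '"GET /sport/index.html HTTP/1.0" 200 2326'
)
entry = parse_line(sample)
```

Returning `None` for lines that do not match keeps malformed entries out of the later stages without stopping the run.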

THE PHASES OF WEB USAGE MINING

Web usage mining is a complete process that
includes the stages of the data mining cycle:
data preprocessing, pattern discovery, and pattern
analysis.
In the data preprocessing stage, the web log is
cleaned, integrated, and transformed into a
common log.
In the pattern discovery stage, data mining techniques
are applied to discover the interesting characteristics
in the hidden patterns.
Pattern analysis is the final stage of web usage
mining. It validates the interesting patterns in the
output of pattern discovery, which can then be used to predict
user behavior.
THE PHASES OF WEB USAGE MINING
Data Preprocessing Process
Data Cleaning:
The log file is first examined to remove
irrelevant entries, such as those that represent
multimedia data and scripts, and uninteresting
entries, such as those that belong to
top/bottom frames.
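
A minimal sketch of the cleaning step, assuming that irrelevant entries can be recognized by the file extension of the requested URL. The extension list and the helper names `is_relevant` and `clean` are assumptions made for illustration; the slides do not state which rules the application applies.

```python
import posixpath
from urllib.parse import urlparse

# Assumed set of extensions treated as multimedia, style, or script resources.
IRRELEVANT_EXTENSIONS = {
    ".gif", ".jpg", ".jpeg", ".png", ".ico",
    ".css", ".js", ".mp3", ".mp4", ".swf",
}


def is_relevant(url: str) -> bool:
    """Keep a request unless its path ends in an assumed irrelevant extension."""
    path = urlparse(url).path  # drops any query string
    extension = posixpath.splitext(path)[1].lower()
    return extension not in IRRELEVANT_EXTENSIONS


def clean(urls):
    """Return only the requested URLs that look like page requests."""
    return [url for url in urls if is_relevant(url)]


# Hypothetical requests, made up for this sketch.
requests = ["/news/index.html", "/img/logo.GIF", "/js/menu.js?v=2", "/sport/"]
cleaned = clean(requests)
```

Entries for top/bottom frames would need a site-specific rule, so they are left out of this sketch.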
Pageview Identification:
Identification of pageviews is heavily
dependent on the intra-page structure of the
site, as well as on the page contents and the
underlying site domain knowledge. Each
pageview can be viewed as a collection of
web objects or resources representing a
specific “user event”.

Data preprocessing steps (diagram): Data Cleaning, Pageview Identification, User Identification, Session Identification
THE PHASES OF WEB USAGE MINING
Data Preprocessing Process
User Identification:
Since several users may share a single
machine name, certain heuristics are
used to identify users. We use the
phrase "user activity record" to refer to the
sequence of logged activities belonging
to the same user.
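
The slides do not say which heuristics are used. As one simple possibility, the sketch below treats the authenticated username as the user when it is present and falls back to the remote host otherwise, then groups the entries into user activity records. The heuristic, the tuple layout, and the function names are all assumptions for illustration.

```python
from collections import defaultdict


def user_key(remotehost: str, username: str) -> str:
    """Assumed heuristic: prefer the authenticated username, else the host."""
    return username if username not in ("", "-") else remotehost


def build_activity_records(entries):
    """Group (remotehost, username, timestamp, url) tuples per identified user."""
    records = defaultdict(list)
    for remotehost, username, timestamp, url in entries:
        records[user_key(remotehost, username)].append((timestamp, url))
    for activity in records.values():
        activity.sort()  # order each user's activity by timestamp
    return dict(records)


# Hypothetical entries; timestamps are plain seconds to keep the sketch short.
entries = [
    ("10.0.0.5", "-", 100, "/news/a.html"),
    ("10.0.0.5", "alice", 130, "/sport/b.html"),
    ("10.0.0.7", "alice", 90, "/social/c.html"),
]
records = build_activity_records(entries)
```

Here the two requests made by `alice` end up in one record even though they came from different hosts, while the anonymous request from `10.0.0.5` is kept separate.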
Session Identification:
 Aims to split the page access of each
user into separated sessions. It defines
the number of times the user has
accessed a web page and time out
defines a time limit for the access of
particular web page for more than 30
minutes if more the session will be
divided in more than one session.
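
The 30-minute rule can be sketched as follows. This version assumes that the timeout applies to the gap between two consecutive requests of the same user, which is one common reading of the rule; the function name `split_sessions` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # threshold taken from the text above


def split_sessions(activity, timeout=SESSION_TIMEOUT):
    """Split one user's time-ordered (timestamp, url) list into sessions."""
    sessions = []
    current = []
    previous_time = None
    for timestamp, url in activity:
        # Start a new session when the gap since the last request is too long.
        if previous_time is not None and timestamp - previous_time > timeout:
            sessions.append(current)
            current = []
        current.append(url)
        previous_time = timestamp
    if current:
        sessions.append(current)
    return sessions


# Hypothetical activity record for one user.
start = datetime(2013, 12, 15, 10, 0)
activity = [
    (start, "/news/a.html"),
    (start + timedelta(minutes=10), "/sport/b.html"),
    (start + timedelta(minutes=50), "/news/c.html"),
    (start + timedelta(minutes=55), "/social/d.html"),
]
sessions = split_sessions(activity)
```

The 40-minute gap between the second and third request exceeds the timeout, so this record is split into two sessions of two pages each.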

Sample of user and session identification
THE PHASES OF WEB USAGE MINING
Pattern Discovery Process:
Discovering user access patterns from the user access log files is the main
purpose of web usage mining.

Association Rule Mining:
Association rule discovery and statistical correlation analysis can
find groups of web page types that are commonly accessed together;
association rule mining can be used to discover correlations between the page
types found in a web log. This technique is applied to the output of user and session
identification, which consists of items where every item represents a page type. We
also use the Apriori algorithm to find the correlation between pages based on
the support and confidence measures.
Which set of page types is frequently accessed together by web users?
e.g.
(Sport, News, Social)
Which page type will be fetched next?
e.g.
Entertainment
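
To make the roles of support and confidence concrete, here is a small, unoptimized sketch of Apriori over sessions of page types. The support of an itemset is the fraction of sessions containing it, and the confidence of a rule A -> B is support(A and B) divided by support(A). The thresholds, the sample sessions, and the function names are assumptions for illustration and do not reproduce the presented C# application.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        level = {}
        for candidate in candidates:
            s = support(candidate, transactions)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        size += 1
        # Join step: unite frequent itemsets of the previous size.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, size - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples meeting min_confidence."""
    result = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    result.append((antecedent, itemset - antecedent, confidence))
    return result


# Hypothetical sessions, each reduced to the set of page types visited.
transactions = [
    frozenset({"Sport", "News", "Social"}),
    frozenset({"Sport", "News"}),
    frozenset({"News", "Social"}),
    frozenset({"Sport", "News", "Social"}),
]
frequent = apriori(transactions, min_support=0.5)
found_rules = association_rules(frequent, min_confidence=0.8)
```

In this toy data every session that contains Sport also contains News, so the rule Sport -> News has confidence 1.0, while News -> Sport reaches only 0.75 and is filtered out.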
THE PHASES OF WEB USAGE MINING
Classification
Classification techniques play an important role in web analytics
applications for modeling users according to various predefined
metrics.
In the web domain, we are interested in developing a profile of users
belonging to a particular class or category. This requires extraction and
selection of features that best describe the properties of a given class or
category.
We also focus on k-nearest neighbor (K-NN), which is
a predictive technique for classification models, where
k represents the number of similar cases or the number of items in the
group.
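
A minimal sketch of K-NN classification, assuming each case is described by a numeric feature vector and compared by Euclidean distance. The two features used here (the share of a user's visits that go to sport pages and to news pages) and the class labels are hypothetical; the slides do not list the features the application uses.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Predict a label by majority vote among the k nearest training cases."""
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training cases: (share of sport visits, share of news visits).
training = [
    ((0.9, 0.1), "sport"),
    ((0.8, 0.2), "sport"),
    ((0.7, 0.3), "sport"),
    ((0.1, 0.9), "news"),
    ((0.2, 0.8), "news"),
    ((0.3, 0.7), "news"),
]
predicted = knn_predict(training, (0.75, 0.25), k=3)
```

An odd k is a common choice for two classes because it avoids tied votes.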
THE PHASES OF WEB USAGE MINING
Pattern Analysis Process:
In this stage, the discovered patterns are further
processed and filtered, possibly resulting in aggregate user models
that can be used as visualization tools. The next figure
summarizes the whole process:
RESULTS OF USING ASSOCIATION RULES

Log file in a flat file format.

Import the log-file database into our implemented
application.
RESULTS OF USING ASSOCIATION RULES

Extract the transactional database of the
web server log for every user, where
every transaction represents a session.
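
The extraction step can be sketched as below, assuming that the page type of a request can be read from the first segment of its URL path. The `PAGE_TYPES` mapping, the `Other` fallback, and the function names are hypothetical; the slides do not describe how the application assigns page types.

```python
# Hypothetical mapping from the first URL path segment to a page type.
PAGE_TYPES = {"sport": "Sport", "news": "News", "social": "Social"}


def page_type(url: str) -> str:
    """Classify a URL by its first path segment; unknown segments map to 'Other'."""
    segments = [part for part in url.split("/") if part]
    first = segments[0].lower() if segments else ""
    return PAGE_TYPES.get(first, "Other")


def to_transactions(sessions_by_user):
    """Turn {user: [session, ...]} into {user: [set of page types per session]}."""
    return {
        user: [frozenset(page_type(url) for url in session) for session in sessions]
        for user, sessions in sessions_by_user.items()
    }


# Hypothetical sessions for one user.
sessions_by_user = {
    "alice": [
        ["/sport/a.html", "/news/b.html"],
        ["/news/c.html", "/about.html"],
    ],
}
transactions_by_user = to_transactions(sessions_by_user)
```

Each session becomes one transaction, so a page type that is visited several times within a session is counted once.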

Find the association rules of user
behavior by applying the Apriori
algorithm to the transactional database of
the user.

CONCLUSION

We used web data that contains all the information the user leaves behind when
accessing web pages. This data is called web logs (or server logs).
Statistical methods such as classification, association rule discovery,
and statistical correlation analysis, which can find groups of web page types
that are commonly accessed together, are applied as well.
Classification is used to map a data item into one of several predefined
classes. Each class belongs to one category, such as sport, politics, or
education. We also use the k-nearest neighbor (K-NN) algorithm as a
common classification method to select the best class.
Association rule mining was used to discover correlations between the site types
found in a web log.
The implemented application was written in the C# programming
language.
Any Questions????

LOGO

More Related Content

What's hot

Web Mining Presentation Final
Web Mining Presentation FinalWeb Mining Presentation Final
Web Mining Presentation FinalEr. Jagrat Gupta
 
Web Usage Pattern
Web Usage PatternWeb Usage Pattern
Web Usage Pattern
Shreyansh Kejriwal
 
clickstream analysis
 clickstream analysis clickstream analysis
clickstream analysis
ERSHUBHAM TIWARI
 
Online Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningOnline Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine Learning
Stefano Tempesta
 
Web mining
Web miningWeb mining
Web Mining
Web MiningWeb Mining
Web Mining
Ziyad Abid
 
Web mining
Web miningWeb mining
Web mining
Daminda Herath
 
Web mining
Web mining Web mining
Web mining
TeklayBirhane
 
What is web scraping?
What is web scraping?What is web scraping?
What is web scraping?
Brijesh Prajapati
 
Web Scraping and Data Extraction Service
Web Scraping and Data Extraction ServiceWeb Scraping and Data Extraction Service
Web Scraping and Data Extraction Service
PromptCloud
 
What is Web-scraping?
What is Web-scraping?What is Web-scraping?
What is Web-scraping?
Yu-Chang Ho
 
Web mining slides
Web mining slidesWeb mining slides
Web mining slides
mahavir_a
 
Web Scraping With Python
Web Scraping With PythonWeb Scraping With Python
Web Scraping With Python
Robert Dempsey
 
Scrapy
ScrapyScrapy
Web mining
Web miningWeb mining
Web mining
Renusoni8
 
Web scraping & browser automation
Web scraping & browser automationWeb scraping & browser automation
Web scraping & browser automation
BHAWESH RAJPAL
 
Data Mining
Data MiningData Mining
Data Mining
SHIKHA GAUTAM
 
Web mining
Web miningWeb mining
Web mining
SarthakSahoo8
 

What's hot (20)

Web Mining Presentation Final
Web Mining Presentation FinalWeb Mining Presentation Final
Web Mining Presentation Final
 
Web Usage Pattern
Web Usage PatternWeb Usage Pattern
Web Usage Pattern
 
clickstream analysis
 clickstream analysis clickstream analysis
clickstream analysis
 
web mining
web miningweb mining
web mining
 
Online Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine LearningOnline Payment Fraud Detection with Azure Machine Learning
Online Payment Fraud Detection with Azure Machine Learning
 
Web mining
Web miningWeb mining
Web mining
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Web mining
Web miningWeb mining
Web mining
 
Web mining
Web mining Web mining
Web mining
 
What is web scraping?
What is web scraping?What is web scraping?
What is web scraping?
 
Web Scraping and Data Extraction Service
Web Scraping and Data Extraction ServiceWeb Scraping and Data Extraction Service
Web Scraping and Data Extraction Service
 
What is Web-scraping?
What is Web-scraping?What is Web-scraping?
What is Web-scraping?
 
Web mining slides
Web mining slidesWeb mining slides
Web mining slides
 
Web Scraping With Python
Web Scraping With PythonWeb Scraping With Python
Web Scraping With Python
 
Scrapy
ScrapyScrapy
Scrapy
 
Web mining
Web miningWeb mining
Web mining
 
Introduction data mining
Introduction data miningIntroduction data mining
Introduction data mining
 
Web scraping & browser automation
Web scraping & browser automationWeb scraping & browser automation
Web scraping & browser automation
 
Data Mining
Data MiningData Mining
Data Mining
 
Web mining
Web miningWeb mining
Web mining
 

Viewers also liked

Preprocessing of Web Log Data for Web Usage Mining
Preprocessing of Web Log Data for Web Usage MiningPreprocessing of Web Log Data for Web Usage Mining
Preprocessing of Web Log Data for Web Usage Mining
Amir Masoud Sefidian
 
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...Advance Clustering Technique Based on Markov Chain for Predicting Next User M...
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...
idescitation
 
Knowledge discoverylaurahollink
Knowledge discoverylaurahollinkKnowledge discoverylaurahollink
Knowledge discoverylaurahollink
SSSW
 
Dotnet titles 2016 17
Dotnet titles 2016 17Dotnet titles 2016 17
Dotnet titles 2016 17
praba123456
 
Webmining ppt
Webmining pptWebmining ppt
Webmining ppt
kiransatyawada
 
Spontaneous Combustion
Spontaneous CombustionSpontaneous Combustion
Spontaneous Combustion
Ron Thaman
 
Computer Applications in Mining Engineering, AKS University
Computer Applications in Mining Engineering, AKS UniversityComputer Applications in Mining Engineering, AKS University
Computer Applications in Mining Engineering, AKS University
Prof-GoldSmith Briz
 
03 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_16
03 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_1603 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_16
03 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_16Scott Jobin-Bevans
 
Enviromental conservasion
Enviromental conservasionEnviromental conservasion
Enviromental conservasionvaishali_bansal
 
magmatic deposits - economic geology
magmatic deposits - economic geologymagmatic deposits - economic geology
magmatic deposits - economic geology
Monikonkona Boruah
 
Mine hazards(2162294)
Mine hazards(2162294)Mine hazards(2162294)
Mine hazards(2162294)
Shivam Bambhaniya
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
Devakumar Jain
 
Economic geology - Magmatic ore deposits_1
Economic geology - Magmatic ore deposits_1Economic geology - Magmatic ore deposits_1
Economic geology - Magmatic ore deposits_1
AbdelMonem Soltan
 
Sublevel stoping..Underground mining methods
Sublevel stoping..Underground mining methodsSublevel stoping..Underground mining methods
Sublevel stoping..Underground mining methods
Geology Department, Faculty of Science, Tanta University
 
NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...
NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...
NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...
KIRAN DAS VAISHNAV
 
Mining ppt 2014
Mining ppt 2014Mining ppt 2014
Mining ppt 2014
Ajoy Raj Saikia
 

Viewers also liked (20)

Preprocessing of Web Log Data for Web Usage Mining
Preprocessing of Web Log Data for Web Usage MiningPreprocessing of Web Log Data for Web Usage Mining
Preprocessing of Web Log Data for Web Usage Mining
 
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...Advance Clustering Technique Based on Markov Chain for Predicting Next User M...
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...
 
Knowledge discoverylaurahollink
Knowledge discoverylaurahollinkKnowledge discoverylaurahollink
Knowledge discoverylaurahollink
 
Dotnet titles 2016 17
Dotnet titles 2016 17Dotnet titles 2016 17
Dotnet titles 2016 17
 
Webmining ppt
Webmining pptWebmining ppt
Webmining ppt
 
5463 26 web mining
5463 26 web mining5463 26 web mining
5463 26 web mining
 
Spontaneous Combustion
Spontaneous CombustionSpontaneous Combustion
Spontaneous Combustion
 
Computer Applications in Mining Engineering, AKS University
Computer Applications in Mining Engineering, AKS UniversityComputer Applications in Mining Engineering, AKS University
Computer Applications in Mining Engineering, AKS University
 
03 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_16
03 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_1603 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_16
03 Haarla ZRI Metal-Mining General Overview (PC) Sept 12_16
 
Stability
StabilityStability
Stability
 
acid mine drainage
 acid mine drainage acid mine drainage
acid mine drainage
 
Enviromental conservasion
Enviromental conservasionEnviromental conservasion
Enviromental conservasion
 
magmatic deposits - economic geology
magmatic deposits - economic geologymagmatic deposits - economic geology
magmatic deposits - economic geology
 
Mine hazards(2162294)
Mine hazards(2162294)Mine hazards(2162294)
Mine hazards(2162294)
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
 
Economic geology - Magmatic ore deposits_1
Economic geology - Magmatic ore deposits_1Economic geology - Magmatic ore deposits_1
Economic geology - Magmatic ore deposits_1
 
Sublevel stoping..Underground mining methods
Sublevel stoping..Underground mining methodsSublevel stoping..Underground mining methods
Sublevel stoping..Underground mining methods
 
Mining methods
Mining methodsMining methods
Mining methods
 
NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...
NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...
NATURAL VENTILATION LITERATURE AND CASE STUDY IN INDIA (DISSERTATION OF THESI...
 
Mining ppt 2014
Mining ppt 2014Mining ppt 2014
Mining ppt 2014
 

Similar to Applying web mining application for user behavior understanding

applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...
applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...
applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...
Zakaria Zubi
 
Web Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage miningWeb Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage mining
IOSR Journals
 
COMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUES
COMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUESCOMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUES
COMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUES
IJDKP
 
Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...
Editor IJCATR
 
Automatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage miningAutomatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage mining
IJMIT JOURNAL
 
Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining
IJMIT JOURNAL
 
Pxc3893553
Pxc3893553Pxc3893553
Pxc3893553
Ouzza Brahim
 
Implementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server MonitoringImplementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server Monitoring
iosrjce
 
C017231726
C017231726C017231726
C017231726
IOSR Journals
 
a novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studioa novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studio
INFOGAIN PUBLICATION
 
Logminingsurvey
LogminingsurveyLogminingsurvey
Logminingsurveydrewz lin
 
A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining
Editor IJMTER
 
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
IOSR Journals
 
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningA Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
IJMER
 
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web LogsWeb Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
ijsrd.com
 
Detection of Behavior using Machine Learning
Detection of Behavior using Machine LearningDetection of Behavior using Machine Learning
Detection of Behavior using Machine Learning
IRJET Journal
 
Web personalization using clustering of web usage data
Web personalization using clustering of web usage dataWeb personalization using clustering of web usage data
Web personalization using clustering of web usage data
ijfcstjournal
 
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
ijdkp
 
IRJET-A Survey on Web Personalization of Web Usage Mining
IRJET-A Survey on Web Personalization of Web Usage MiningIRJET-A Survey on Web Personalization of Web Usage Mining
IRJET-A Survey on Web Personalization of Web Usage Mining
IRJET Journal
 
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
IJSRD
 

Similar to Applying web mining application for user behavior understanding (20)

applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...
applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...
applyingwebminingapplicationforuserbehaviorunderstanding-131215105223-phpapp0...
 
Web Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage miningWeb Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage mining
 
COMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUES
COMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUESCOMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUES
COMPARISON ANALYSIS OF WEB USAGE MINING USING PATTERN RECOGNITION TECHNIQUES
 
Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...
 
Automatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage miningAutomatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage mining
 
Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining
 
Pxc3893553
Pxc3893553Pxc3893553
Pxc3893553
 
Implementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server MonitoringImplementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server Monitoring
 
C017231726
C017231726C017231726
C017231726
 
a novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studioa novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studio
 
Logminingsurvey
LogminingsurveyLogminingsurvey
Logminingsurvey
 
A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining
 
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
 
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningA Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
 
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web LogsWeb Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
 
Detection of Behavior using Machine Learning
Detection of Behavior using Machine LearningDetection of Behavior using Machine Learning
Detection of Behavior using Machine Learning
 
Web personalization using clustering of web usage data
Web personalization using clustering of web usage dataWeb personalization using clustering of web usage data
Web personalization using clustering of web usage data
 
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
 
IRJET-A Survey on Web Personalization of Web Usage Mining
IRJET-A Survey on Web Personalization of Web Usage MiningIRJET-A Survey on Web Personalization of Web Usage Mining
IRJET-A Survey on Web Personalization of Web Usage Mining
 
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
 

More from Zakaria Zubi

Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)
Zakaria Zubi
 
Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases
Zakaria Zubi
 
I- Extended Databases
I- Extended DatabasesI- Extended Databases
I- Extended Databases
Zakaria Zubi
 
Using Data Mining Techniques to Analyze Crime Pattern
Using Data Mining Techniques to Analyze Crime PatternUsing Data Mining Techniques to Analyze Crime Pattern
Using Data Mining Techniques to Analyze Crime Pattern
Zakaria Zubi
 
COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA
COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA
COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA
Zakaria Zubi
 
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
Zakaria Zubi
 
Arabic Text mining Classification
Arabic Text mining Classification Arabic Text mining Classification
Arabic Text mining Classification Zakaria Zubi
 
Ibtc dwt hybrid coding of digital images
Ibtc dwt hybrid coding of digital imagesIbtc dwt hybrid coding of digital images
Ibtc dwt hybrid coding of digital imagesZakaria Zubi
 
Information communication technology in libya for educational purposes
Information communication technology in libya for educational purposesInformation communication technology in libya for educational purposes
Information communication technology in libya for educational purposesZakaria Zubi
 

More from Zakaria Zubi (13)

Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)Knowledge Discovery Query Language (KDQL)
Knowledge Discovery Query Language (KDQL)
 
Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases
 
I- Extended Databases
I- Extended DatabasesI- Extended Databases
I- Extended Databases
 
Using Data Mining Techniques to Analyze Crime Pattern
Using Data Mining Techniques to Analyze Crime PatternUsing Data Mining Techniques to Analyze Crime Pattern
Using Data Mining Techniques to Analyze Crime Pattern
 
COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA
COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA
COMPARISON OF ROUTING PROTOCOLS FOR AD HOC WIRELESS NETWORK WITH MEDICAL DATA
 
Ismail&&ziko 2003
Ismail&&ziko 2003Ismail&&ziko 2003
Ismail&&ziko 2003
 
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime...
 
Arabic Text mining Classification
Arabic Text mining Classification Arabic Text mining Classification
Arabic Text mining Classification
 
Edi text
Edi textEdi text
Edi text
 
Model
ModelModel
Model
 
Ibtc dwt hybrid coding of digital images
Ibtc dwt hybrid coding of digital imagesIbtc dwt hybrid coding of digital images
Ibtc dwt hybrid coding of digital images
 
Deep Web mining
Deep Web miningDeep Web mining
Deep Web mining
 
Information communication technology in libya for educational purposes
Information communication technology in libya for educational purposesInformation communication technology in libya for educational purposes
Information communication technology in libya for educational purposes
 

Applying web mining application for user behavior understanding

  • 1. APPLYING WEB MINING APPLICATION FOR USER BEHAVIOR UNDERSTANDING. Dr. Zakaria Suliman Zubi, Associate Professor, Computer Science Department, Faculty of Science, Sirte University, Libya
  • 3. ABSTRACT: Web usage mining (WUM) focuses on discovering potential knowledge from the browsing patterns of users, which leads us to find the correlation between pages in the analysis stage. The primary data source used in web usage mining is the server log files (web logs). Browsing web pages leaves a lot of information in the log file, and analyzing this information helps us understand the behavior of the user. The web log is an essential input for web mining to extract usage patterns and study the visiting characteristics of users. Our paper focuses on the use of web mining techniques to classify web page types according to user visits. This classification helps us understand user behavior. We also use some classification and association rule techniques to discover potential knowledge from the browsing patterns.
  • 5. INTRODUCTION: The Internet offers a huge, global information center for news, advertising, consumer information, financial management, education, government, and e-commerce. The aim of using web mining techniques for understanding user behavior is to profile user characteristics. Web mining can be organized into three main categories: web content mining, web structure mining, and web usage mining.
  • 6. INTRODUCTION (cont.): Web mining divides into web structure mining, web content mining, and web usage mining. 1. Web content mining analyzes web content such as text, multimedia data, and structured data (within web pages or linked across web pages). 2. Web structure mining is the process of using graph and network mining theory and methods to analyze the nodes and connection structures on the Web. 3. Web usage mining is a special type of web mining that discovers the knowledge hidden in browsing patterns and analyzes the visiting characteristics of users.
  • 7. INTRODUCTION (cont.): The primary data of web usage mining: 1. Web server logs. 2. Data about visitors of the sites. 3. Registration forms. Fig 2: portion of a typical server log. A standard log file has the following format: remotehost; logname; username; date; request; status; bytes, where remotehost is the remote hostname or its IP address; logname is the remote log name of the user; username is the username with which the user has authenticated; date is the date and time of the request; request is the exact request line as it came from the client; status is the HTTP status code returned to the client; and bytes is the content length of the document transferred.
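The seven fields listed above match the NCSA Common Log Format written by many web servers. As a minimal sketch, assuming the log really is in that layout, the Python below splits one line into the named fields. The regular expression, the function name `parse_log_line`, and the sample line are illustrative assumptions, not taken from the slides or from the authors' C# application.

```python
import re
from datetime import datetime

# Assumed layout: NCSA Common Log Format, one request per line:
# remotehost logname username [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<logname>\S+) (?P<username>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_log_line(line):
    """Return a dict of the seven fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the textual fields that have a natural typed form.
    entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry

# Hypothetical sample line, made up for illustration.
sample = ('192.168.1.10 - alice [10/Oct/2010:13:55:36 +0200] '
          '"GET /sport/index.html HTTP/1.0" 200 2326')
entry = parse_log_line(sample)
```

Lines that do not fit the pattern return `None`, so malformed entries can be counted or discarded before preprocessing.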
  • 9. THE PHASES OF WEB USAGE MINING: Web usage mining is a complete process that includes the stages of the data mining cycle: data preprocessing, pattern discovery, and pattern analysis. Initially, at the data preprocessing stage, the web log is cleaned, integrated, and transformed into a common log. In pattern discovery, data mining techniques are applied to discover the interesting characteristics in the hidden patterns. Pattern analysis is the final stage of web usage mining; it validates the interesting patterns from the output of pattern discovery, which can then be used to predict user behavior.
  • 10. THE PHASES OF WEB USAGE MINING: Data Preprocessing Process. The preprocessing steps are data cleaning, pageview identification, user identification, and session identification. Data Cleaning: The log file is first examined to remove irrelevant entries, such as those that represent multimedia data and scripts, and uninteresting entries, such as those that belong to top/bottom frames. Pageview Identification: Identification of pageviews depends heavily on the intra-page structure of the site, as well as on the page contents and the underlying site domain knowledge. Each pageview can be viewed as a collection of Web objects or resources representing a specific "user event".
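The data cleaning step described above can be sketched as a filter over parsed log entries. This is a minimal illustration and not the authors' implementation: the suffix list `IRRELEVANT_SUFFIXES`, the helper names, and the sample entries are assumptions, and a real site would need its own list (including any frame pages to exclude).

```python
from urllib.parse import urlparse

# Hypothetical suffixes treated as multimedia or script requests; adjust per site.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".ico",
                       ".css", ".js", ".mp3", ".mp4")

def is_relevant(request_line):
    """True if the request line ("METHOD path PROTOCOL") targets a page, not an asset."""
    parts = request_line.split()
    if len(parts) < 2:
        return False  # malformed request line
    path = urlparse(parts[1]).path.lower()
    return not path.endswith(IRRELEVANT_SUFFIXES)

def clean(entries):
    """Keep only the entries whose request is relevant for usage mining."""
    return [e for e in entries if is_relevant(e["request"])]

# Made-up entries for illustration; only the "request" field is used here.
raw_entries = [
    {"request": "GET /news/today.html HTTP/1.0"},
    {"request": "GET /images/banner.gif HTTP/1.0"},
    {"request": "GET /scripts/menu.js?v=2 HTTP/1.0"},
    {"request": "GET /sport/ HTTP/1.0"},
]
cleaned = clean(raw_entries)
```

Parsing the path with `urlparse` before checking the suffix means that a query string such as `?v=2` does not hide a script or image request.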
  • 11. THE PHASES OF WEB USAGE MINING: Data Preprocessing Process. User Identification: Since several users may share a single machine name, certain heuristics are used to identify users. We use the phrase user activity record to refer to the sequence of logged activities belonging to the same user. Session Identification: Aims to split the page accesses of each user into separate sessions. It records the number of times the user has accessed a web page, and a timeout defines a time limit for page accesses: if the limit of 30 minutes is exceeded, the activity is divided into more than one session. Figure: sample of user and session identification.
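One common reading of the 30-minute rule is that a new session starts whenever the gap between two consecutive requests from the same user exceeds the timeout. The sketch below follows that reading, which is an assumption, as is using the remote host as the user identifier (the slides mention heuristics without naming them). The function name `split_sessions` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # threshold stated in the slides

def split_sessions(records, timeout=SESSION_TIMEOUT):
    """records: iterable of (user_id, timestamp, page).

    Returns {user_id: [[page, ...], ...]}, one inner list per session.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, page in records:
        by_user[user_id].append((timestamp, page))

    sessions = {}
    for user_id, visits in by_user.items():
        visits.sort(key=lambda v: v[0])  # chronological order per user
        user_sessions = []
        previous = None
        for timestamp, page in visits:
            # Open a new session on the first visit or after a long gap.
            if previous is None or timestamp - previous > timeout:
                user_sessions.append([])
            user_sessions[-1].append(page)
            previous = timestamp
        sessions[user_id] = user_sessions
    return sessions

# Made-up records: the user id is assumed to be the remote host.
records = [
    ("10.0.0.1", datetime(2010, 10, 10, 9, 0), "/news"),
    ("10.0.0.2", datetime(2010, 10, 10, 9, 5), "/news"),
    ("10.0.0.1", datetime(2010, 10, 10, 9, 10), "/sport"),
    ("10.0.0.1", datetime(2010, 10, 10, 10, 0), "/social"),
]
sessions = split_sessions(records)
```

In this sample the 50-minute gap before the `/social` request of `10.0.0.1` opens a second session for that user.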
  • 12. THE PHASES OF WEB USAGE MINING: Pattern Discovery Process. Discovering user access patterns from the user access log files is the main purpose of web usage mining. Association Rule Mining: Association rule discovery and statistical correlation analysis can find groups of web page types that are commonly accessed together; that is, association rule mining can be used to discover correlations between the page types found in a web log. This technique is applied to the identified users and sessions, each consisting of items where every item represents a page type. We also use the Apriori algorithm to find the correlation between pages based on support and confidence values. Which sets of page types are frequently accessed together by web users? e.g. (Sport, News, Social). Which page type will be fetched next? e.g. Entertainment.
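To make support and confidence concrete, here is a small, unoptimized sketch of Apriori over sessions whose items are page types. It is an illustration under assumptions, not the authors' C# implementation: the function names, the thresholds, and the five sample sessions are invented. Support of an itemset is the fraction of sessions containing it, and the confidence of a rule A → B is support(A ∪ B) / support(A).

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep the candidates of size k that meet the support threshold.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((set(antecedent), set(itemset - antecedent),
                                  sup, confidence))
    return rules

# Made-up sessions; each item is a page type.
page_type_sessions = [
    {"Sport", "News", "Social"},
    {"Sport", "News"},
    {"News", "Entertainment"},
    {"Sport", "News", "Social"},
    {"Social", "Entertainment"},
]
frequent = apriori(page_type_sessions, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.8)
```

With these sample sessions, the rule Sport → News has support 3/5 and confidence 1.0, because every session that contains Sport also contains News.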
  • 13. THE PHASES OF WEB USAGE MINING: Classification. Classification techniques play an important role in Web analytics applications for modeling users according to various predefined metrics. In the Web domain, we are interested in developing a profile of users belonging to a particular class or category. This requires extraction and selection of the features that best describe the properties of a given class or category. We also focus on k-nearest neighbor (K-NN), which is considered a predictive technique for classification models. Here, k represents the number of similar cases, or the number of items in the group.
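The role of k can be illustrated with a minimal K-NN sketch: a query is assigned the majority label among its k closest labelled examples. The slides do not state which features or distance measure were used, so the two numeric features, the Euclidean distance, the function name `knn_classify`, and the training data below are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """training: list of (feature_vector, label).

    Returns the majority label among the k examples closest to query.
    """
    nearest = sorted(training, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical features: (average minutes per visit, visits per week).
training = [
    ((1.0, 9.0), "News"),
    ((1.5, 8.0), "News"),
    ((2.0, 7.5), "News"),
    ((8.0, 2.0), "Sport"),
    ((9.0, 1.5), "Sport"),
    ((7.5, 3.0), "Sport"),
]
predicted = knn_classify(training, (2.0, 8.0), k=3)
```

An odd k with two classes avoids tied votes; with more classes a tie-breaking rule would need to be chosen explicitly.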
  • 14. THE PHASES OF WEB USAGE MINING: Pattern Analysis Process. In this stage, the discovered patterns are further processed and filtered, possibly resulting in aggregate user models that can be used as visualization tools. The next figure summarizes the whole process.
  • 16. RESULTS OF USING ASSOCIATION RULES: The log file is in a flat file format. The log-file database is imported into our implemented application.
  • 17. RESULTS OF USING ASSOCIATION RULES: Extract the transactional database of the web server log for every user, where every transaction represents a session. Find the association rules of user behavior after applying the Apriori algorithm to the transactional database of the user.
  • 19. CONCLUSION: We used web data that contained all the information the user leaves behind when accessing web pages. This data is called web logs (or server logs). Statistical methods such as classification, association rule mining, and statistical correlation analysis, which can find groups of web page types that are commonly accessed together, were applied as well. Classification is used to map a data item into one of several predefined classes; each class belongs to one category such as sport, politics, or education. We also used the k-nearest neighbor (K-NN) algorithm as a common classification method to select the best class. Association rule mining was used to discover correlations between the site types found in a web log. The implemented application was written in the C# programming language.