In recent years, with the development of Internet technology, the growth of the World Wide Web has exceeded all expectations. A great deal of information is available in different formats, and retrieving interesting content has become a very difficult task. One possible approach to this problem is Web Usage Mining (WUM), an important application of Web Mining. Extracting the hidden knowledge in the log files of a web server, recognizing the various interests of web users, and discovering customer behavior while at the site are normally referred to as applications of web usage mining. In this paper we provide an updated, focused survey of web usage mining techniques.
AN INTELLIGENT OPTIMAL GENETIC MODEL TO INVESTIGATE THE USER USAGE BEHAVIOUR ...ijdkp
The unexpectedly widespread use of the WWW and the dynamically increasing nature of the web create new
challenges in web mining, since the data on the web is inherently unlabelled, incomplete, non-linear, and
heterogeneous. The investigation of user usage behaviour on the WWW is a real-time problem that involves
multiple conflicting measures of performance. These measures not only make the problem computationally intensive but
also raise the possibility that an exact solution cannot be found. Unfortunately, the conventional methods
are of limited use for such optimization problems due to the absence of semantic certainty and the presence of human
intervention. To handle such data and overcome the limitations of conventional methodologies, it is
necessary to use a soft computing model that can work intelligently to attain an optimal solution.
A NEW IMPROVED WEIGHTED ASSOCIATION RULE MINING WITH DYNAMIC PROGRAMMING APPR...cscpconf
With the rapid development of the Internet, web search has taken on an important role in everyday life. In web search, mining frequent patterns in large databases is a major research area. As user activity on the web increases, methods that predict the next request in a user's visit to web pages play a major role. Web searching methods help provide quality results and timely answers, and also offer customized navigation. In web search, association rule mining is an important data analysis method for discovering associated web pages. Most researchers have implemented association mining using the Apriori algorithm with a binary representation. The problem with this approach is that it does not address issues such as the navigation order of web pages. To overcome this problem, researchers proposed a weighted Apriori that maintains navigation order but is unable to produce optimal results. Aiming for the most favorable result, we propose a novel approach that combines weighted Apriori and dynamic programming. The experimental results show that this approach maintains the navigation order of web pages and achieves the best solution. The proposed technique enhances web site effectiveness, increases the user's browsing knowledge, improves prediction accuracy, and decreases computational complexity.
IRJET-A Survey on Web Personalization of Web Usage MiningIRJET Journal
S.Jagan, Dr.S.P.Rajagopalan "A Survey on Web Personalization of Web Usage Mining", International Research Journal of Engineering and Technology (IRJET),Volume 2,issue-01 Mar-2015. e-ISSN:2395-0056, p-ISSN:2395-0072. www.irjet.net , published by Fast Track Publications
Abstract
Nowadays, the World Wide Web (WWW) is a rich and powerful source of information. Day by day it is becoming more complex and expanding in size as more information is made available online. As a result, retrieving the exact information that users expect is becoming a more complex and critical task. One concept for dealing with this problem is personalization, which is becoming increasingly important. Personalization is a subclass of information filtering systems that seek to predict the 'ratings' or 'preferences' that a user would give to items they have not yet considered, using a model built from the characteristics of an item (content-based approaches or collaborative filtering approaches). Web mining is an emerging field of data mining used to provide personalization on the web. It consists of three major categories: Web Content Mining, Web Usage Mining, and Web Structure Mining. This paper focuses on web usage mining and the algorithms used for providing personalization on the web.
This document proposes a new technique to enhance the learning capabilities and reduce the computation intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model uses a multi-layered network architecture with backpropagation learning to analyze web log data. Data preprocessing steps like cleaning, user identification, and transaction identification are applied to prepare the enterprise proxy log data for analysis. The proposed framework aims to discover useful patterns from web log data through a combination of K-means clustering and a feedforward neural network.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Integrated Web Recommendation Model with Improved Weighted Association Rule M...ijdkp
The World Wide Web plays a significant role in human life. It requires technological improvement to satisfy
user needs. Web log data is essential for improving the performance of the web. It contains large,
heterogeneous and diverse data. Analyzing the web log data is a tedious process for Web developers,
Web designers, technologists and end users. In this work, a new weighted association mining algorithm is
developed to identify the best association rules that are useful for web site restructuring and
recommendation, reducing false visits and improving users' navigation behavior. The algorithm finds the
frequent item sets in a large uncertain database. Existing algorithms scan the database repeatedly,
which leads to a complex output set and a time-consuming process. The
proposed algorithm scans the database only once, at the beginning of the process, and the generated
frequent item sets are stored in the database. Evaluation parameters such as support,
confidence, lift and number of rules are used to compare the performance of the proposed algorithm with that of a
traditional association mining algorithm. The new algorithm produced the best result, which helps developers
restructure their website to meet the requirements of the end user within a short time span.
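The evaluation parameters named above (support, confidence, lift) can be illustrated with a minimal sketch. This is not the weighted algorithm the abstract describes; it only shows the standard unweighted definitions over a set of sessions. The function names and the sample sessions are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(sessions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of sessions that contain every page in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for s in sessions if items <= s) / len(sessions)


def confidence(sessions, antecedent, consequent) -> float:
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(sessions, antecedent)
    if base == 0:
        return 0.0
    return support(sessions, set(antecedent) | set(consequent)) / base


def lift(sessions, antecedent, consequent) -> float:
    """Confidence divided by the consequent's support; > 1 suggests a positive association."""
    denom = support(sessions, consequent)
    if denom == 0:
        return 0.0
    return confidence(sessions, antecedent, consequent) / denom


# Hypothetical sessions: each one is the set of pages visited by a user.
SESSIONS = [
    frozenset(s)
    for s in (
        {"/home", "/courses", "/apply"},
        {"/home", "/courses"},
        {"/home", "/contact"},
        {"/courses", "/apply"},
    )
]

if __name__ == "__main__":
    rule = ({"/courses"}, {"/apply"})
    print("support   ", support(SESSIONS, rule[0] | rule[1]))
    print("confidence", confidence(SESSIONS, *rule))
    print("lift      ", lift(SESSIONS, *rule))
```

Because sessions are treated as plain sets, this sketch ignores navigation order, which is exactly the limitation that weighted variants aim to address.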
Web Page Recommendation Using Web MiningIJERA Editor
On the World Wide Web, various kinds of content are generated in huge amounts, so web recommendation has become an important part of web applications for giving relevant results to users. Different kinds of web recommendations are made available to users every day, including images, video, audio, query suggestions and web pages. In this paper we aim to provide a framework for web page recommendation. 1) First we describe the basics and types of web mining. 2) We then give details of each web mining technique. 3) Finally, we propose an architecture for personalized web page recommendation.
A Web Extraction Using Soft Algorithm for Trinity Structureiosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
The World Wide Web is a huge repository of information, and the volume of
information increases tremendously every day. The number of users is also increasing day by day. A lot
of research has been done to reduce users' browsing time. Web Usage Mining is a type of web mining in which mining techniques are
applied to log data to extract the behaviour of users. Clustering plays an important role in a broad range
of applications such as Web analysis, CRM, marketing, medical diagnostics, computational biology, and many
others. Clustering is the grouping of similar instances or objects. The key factor in clustering is some sort
of measure that can determine whether two objects are similar or dissimilar. In this paper a novel
clustering method to partition user sessions into accurate clusters is discussed. The accuracy and various
performance measures of the proposed algorithm show that the proposed method is a better method for
web log mining.
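As an illustration of the similarity measure that clustering depends on, here is a minimal sketch using Jaccard similarity over the sets of pages in two sessions, followed by a simple leader-style grouping. Both the choice of Jaccard and the greedy grouping are assumptions made for illustration; the abstract does not specify the novel method it proposes. All names and sample sessions are hypothetical.

```python
from typing import List, Sequence


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Size of the intersection over size of the union of two page sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty sessions are treated as identical
    return len(sa & sb) / len(sa | sb)


def leader_clusters(sessions: List[Sequence[str]], threshold: float = 0.5):
    """Greedy grouping: a session joins the first cluster whose leader
    (its first member) is at least `threshold` similar, else starts a new one."""
    clusters: List[List[Sequence[str]]] = []
    for session in sessions:
        for cluster in clusters:
            if jaccard(session, cluster[0]) >= threshold:
                cluster.append(session)
                break
        else:
            clusters.append([session])
    return clusters


# Hypothetical user sessions (ordered lists of visited pages).
SAMPLE_SESSIONS = [
    ["/home", "/courses", "/apply"],
    ["/home", "/courses"],
    ["/sports", "/news"],
    ["/news", "/sports", "/weather"],
]

if __name__ == "__main__":
    for i, cluster in enumerate(leader_clusters(SAMPLE_SESSIONS)):
        print(i, cluster)
```

The result of the greedy grouping depends on the order of the input sessions; it is used here only because it is short enough to read at a glance.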
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...IJSRD
This document discusses an enhanced approach for detecting user behavior through country-wise local search. It begins with an abstract describing the development of the web and challenges in the field. It then discusses various techniques for web mining including web usage mining, web content mining, and web structure mining. It also discusses sequential pattern mining algorithms and procedures for recommendation systems. The key contribution is proposing a new local search algorithm for country-wise search to make searching more efficient based on local results.
International Journal of Engineering Research and DevelopmentIJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
An effective search on web log from most popular downloaded contentijdpsjournal
A Web page recommender system effectively predicts the best related web page to search. While searching
for a word, a search engine may display unnecessary links and unrelated data to the user. To avoid
this problem, the conceptual prediction model combines both web usage and domain knowledge. The
proposed conceptual prediction model automatically generates a semantic network of the semantic Web
usage knowledge, which is the integration of domain knowledge and web usage information. Web usage
mining aims to discover interesting and frequent user access patterns from web browsing data. The
discovered knowledge can then be used for many practical web applications such as web
recommendations, adaptive web sites, and personalized web search and surfing.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
In this world of information technology, everyone tends to do business electronically. Today
a lot of business happens on the World Wide Web (WWW), so it is very important for website owners to
provide a better platform to attract more customers to their sites. Providing information in a better way is
the way to bring in more customers or users. The customer is the end user, who accesses the information
in a way that yields some credit to the web site owners. In this paper we define web mining and present a
method for using web mining to better understand user and website behaviour, which in turn
helps enhance the web site's information to attract more users. This paper also presents an overview of the
various research done on pattern extraction and web content mining, and how it can act as a catalyst
for E-business.
The document describes a proposed fuzzy logic-based model for classifying web users in a personalized search system. The model collects user browsing data using a customized browser. It then fuzzifies the data and generates fuzzy rules using decision trees. These rules are used to label search pages and group users according to their search interests. The model is evaluated against a Bayesian classifier and shown to perform better. The goal is to handle the dynamic and fluctuating nature of user behavior and interests that exist in a personalized web search environment.
A Review: Text Classification on Social Media DataIOSR Journals
This document provides a review of different classifiers used for text classification on social media data. It discusses how social media data is often unstructured and contains users' opinions and sentiments. Various machine learning algorithms can be used to classify this social media text data, extracting meaningful information. The document focuses on describing Naive Bayes classifiers, which are commonly used for text classification tasks. It explains how Naive Bayes classifiers work by calculating the posterior probability that a document belongs to a certain class, based on applying Bayes' theorem with an independence assumption between features.
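The posterior computation described above can be sketched as a small multinomial Naive Bayes classifier. This is a minimal illustration, assuming whitespace-tokenized text and add-one (Laplace) smoothing; neither detail is stated in the reviewed document, and the function names and training data are hypothetical. It picks the class c that maximizes log P(c) plus the sum of log P(w | c) over the words w, which is Bayes' theorem under the feature-independence assumption with the constant denominator dropped.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Model = Tuple[Counter, Dict[str, Counter], Set[str]]


def train(docs: List[Tuple[List[str], str]]) -> Model:
    """Count documents per class and word occurrences per class."""
    class_counts: Counter = Counter()
    word_counts: Dict[str, Counter] = defaultdict(Counter)
    vocab: Set[str] = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model: Model, tokens: List[str]) -> str:
    """Return the class with the highest (unnormalized) log posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = "", -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocab:
                continue  # ignore words never seen in training
            # add-one smoothing keeps unseen (word, class) pairs from zeroing the product
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled posts.
TRAINING = [
    ("great phone love it".split(), "pos"),
    ("love this great service".split(), "pos"),
    ("terrible battery hate it".split(), "neg"),
    ("hate this terrible support".split(), "neg"),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(predict(model, "love great".split()))
    print(predict(model, "hate terrible battery".split()))
```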
The document discusses a review process for analyzing contextual human information behavior factors in web usage mining. It first searches journals and search engines to find empirical studies related to gender differences, prior knowledge and cognitive styles. These studies are then examined to analyze how these three human factors impact web-based interactions. While some commercial analysis applications exist, more work still needs to be done by researchers and developers to build efficient and powerful tools for studying human information behavior.
Personalized web search using browsing history and domain knowledgeRishikesh Pathak
This document proposes a framework for improving personalized web search by constructing an enhanced user profile using both the user's browsing history and domain knowledge. The enhanced user profile is used to better suggest relevant web pages to the user based on their search query. An experiment found that suggestions made using the enhanced user profile performed better than using a standard user profile alone. The framework involves modeling the user, re-ranking search results, and displaying personalized results based on the enhanced user profile.
Classification-based Retrieval Methods to Enhance Information Discovery on th...IJMIT JOURNAL
The widespread adoption of the World-Wide Web (the Web) has created challenges both for society as a whole and for the technology used to build and maintain the Web. The ongoing struggle of information retrieval systems is to wade through this vast pile of data and satisfy users by presenting them with information that most adequately fits their needs. On a societal level, the Web is expanding faster than we can comprehend its implications or develop rules for its use. The ubiquitous use of the Web has raised important social concerns in the areas of privacy, censorship, and access to information. On a technical level, the novelty of the Web and the pace of its growth have created challenges not only in the development of new applications that realize the power of the Web, but also in the technology needed to scale applications to accommodate the resulting large data sets and heavy loads. This thesis presents searching algorithms and hierarchical classification techniques for increasing a search service's understanding of web queries. Existing search services rely solely on a query's occurrence in the document collection to locate relevant documents. They typically do not perform any task or topic-based analysis of queries using other available resources, and do not leverage changes in user query patterns over time. Provided within is a set of techniques and metrics for performing temporal analysis on query logs. Our log analyses are shown to be reasonable and informative, and can be used to detect changing trends and patterns in the query stream, thus providing valuable data to a search service.
The document discusses various techniques for web crawling and focused web crawling. It describes the functions of web crawlers including web content mining, web structure mining, and web usage mining. It also discusses different types of crawlers and compares algorithms for focused crawling such as decision trees, neural networks, and naive bayes. The goal of focused crawling is to improve precision and download only relevant pages through relevancy prediction.
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...ijdmtaiir
In this study a comprehensive evaluation of two
supervised feature selection methods for dimensionality
reduction is performed: Latent Semantic Indexing (LSI) and
Principal Component Analysis (PCA). This is gauged against
unsupervised techniques like fuzzy feature clustering using
hard fuzzy C-means (FCM). The main objective of the study is
to estimate the relative efficiency of two supervised techniques
against unsupervised fuzzy techniques while reducing the
feature space. It is found that clustering using FCM leads to
better accuracy in classifying documents in the face of
evolutionary algorithms like LSI and PCA. Results show that
the clustering of features improves the accuracy of document
classification.
Identifying the Number of Visitors to improve Website Usability from Educatio...Editor IJCATR
Web usage mining deals with understanding visitors' behaviour on a website. It helps in understanding concerns
such as the present and future probability of every website user and the relationship between behaviour and website usability. Web mining has different
branches, such as web content mining, web structure mining and web usage mining. The focus of this paper is on mining web usage patterns from
the web log data of an educational institution. There are three types of web-related log data, namely web access logs, error logs and proxy logs.
In this paper web access log data has been used as the dataset because it is the typical source of the navigational
behaviour of website visitors. The study of web server log analysis is helpful in applying web mining techniques.
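A web access log can be turned into visitor counts with a few lines of code. The sketch below assumes the log is in the widely used Common Log Format and treats each distinct client address with a 2xx response as one visitor; both are assumptions, since the abstract does not state the log format or how visitors are identified. The function names and sample lines are hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# host ident authuser [timestamp] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str) -> Optional[dict]:
    """Parse one access-log line; return None if it does not match the format."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def count_visitors(lines: Iterable[str]) -> int:
    """Number of distinct client addresses with at least one successful request."""
    hosts = set()
    for line in lines:
        record = parse_line(line)
        if record is not None and 200 <= record["status"] < 300:
            hosts.add(record["host"])
    return len(hosts)


# Hypothetical log lines (documentation-range IP addresses).
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /admissions.html HTTP/1.1" 200 2326',
    '192.0.2.10 - - [10/Oct/2023:13:56:01 +0000] "GET /fees.html HTTP/1.1" 200 1450',
    '192.0.2.22 - - [10/Oct/2023:14:02:11 +0000] "GET /missing.html HTTP/1.1" 404 -',
    '192.0.2.35 - - [10/Oct/2023:14:05:47 +0000] "GET /index.html HTTP/1.1" 200 5120',
    'not a log line',
]

if __name__ == "__main__":
    print(count_visitors(SAMPLE_LOG))
```

Counting by client address undercounts users behind a shared proxy and overcounts users whose address changes, so it is only a rough proxy for the number of visitors.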
IJRET : International Journal of Research in Engineering and TechnologyImprov...eSAT Publishing House
This document summarizes techniques for improving web search results through web personalization. It discusses how web usage mining can be used to optimize information by monitoring user interaction histories and profiles. The proposed system aims to reduce manual user feedback by implicitly gathering preferences from behaviors like click-through rates and dwell times. It introduces an algorithm that calculates new ranking values for websites based on keyword matches and time spent on pages, and swaps ranks accordingly. This system provides personalized search results by continuously updating rankings based on implicit user interactions.
This document discusses interactive visualization techniques for information retrieval. It begins by stating that information retrieval systems often return many results, some more relevant than others. While search engines have grown, problems remain with low precision and recall. Visualization techniques can help users better understand retrieval results. The document then reviews several visualization methods like tree views, title views, and bubble views that can enhance web information retrieval systems by helping users browse, filter, and reformulate queries. It argues visualization is an effective tool for dealing with large numbers of documents returned in web searches.
A novel method for generating an elearning ontologyIJDKP
The Semantic Web provides a common framework that allows data to be shared and reused across
applications, enterprises, and community boundaries. The existing web applications need to express
semantics that can be extracted from users' navigation and content, in order to fulfill users' needs. E-learning
has specific requirements that can be satisfied through the extraction of semantics from learning
management systems (LMS) that use relational databases (RDB) as backend. In this paper, we propose
transformation rules for building an OWL ontology from the RDB of the open source LMS Moodle. It allows
transforming all possible cases in RDBs into ontological constructs. The proposed rules are enriched by
analyzing stored data to detect disjointness and totalness constraints in hierarchies, and calculating the
participation level of tables in n-ary relations. In addition, our technique is generic; hence it can be applied
to any RDB.
This document summarizes a research paper that proposes a bi-objective recommendation framework (BORF) for venue recommendation on mobile social networks. The BORF uses multiple objective optimization techniques to generate personalized recommendations. It addresses issues like cold start and data sparsity by preprocessing data using a Hub-Average inference model. Both scalar optimization using a Weighted Sum Approach and vector optimization using a NSGA-II evolutionary algorithm are implemented to provide optimal venue recommendations to users. Experimental results on a large real-world dataset confirm the accuracy of the proposed recommendation system.
This document summarizes previous work on content extraction from web pages and proposes a new approach. It discusses existing methods that use techniques like entropy analysis, DOM trees, clustering, and ratios of text, links and tags. The proposed approach combines word to leaf ratio with text link ratio and link text ratio to identify informative nodes in the DOM tree. It calculates weights and relative positions of nodes to select the most informative content. The method will be tested on different website types and compared to existing approaches.
A detail survey of page re ranking various web features and techniquesijctet
This document discusses techniques for page re-ranking on websites based on user behavior analysis. It describes how web usage mining involves analyzing web server logs to extract patterns in user behavior. Common techniques discussed for page re-ranking include Markov models, data mining approaches like clustering and association rule mining, and analyzing linked web page structures. The goal is to better understand user interests and predict future page access to improve information retrieval and optimize website design.
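Of the techniques listed above, the Markov model is the simplest to show concretely. The following is a minimal sketch of a first-order Markov predictor that estimates the most likely next page from transition counts between consecutive pages. It is an illustration under the assumption that sessions are available as ordered page lists; it is not the specific method of the surveyed work, and all names and sample sessions are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple


def fit_transitions(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count how often each page is immediately followed by each other page."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts: Dict[str, Counter], page: str) -> Optional[Tuple[str, float]]:
    """Most frequent successor of `page` and its estimated transition probability,
    or None if `page` was never followed by another page in the training sessions."""
    if page not in counts:
        return None
    followers = counts[page]
    total = sum(followers.values())
    next_page, hits = followers.most_common(1)[0]
    return next_page, hits / total


# Hypothetical navigation sessions.
NAV_SESSIONS = [
    ["/home", "/courses", "/apply"],
    ["/home", "/courses", "/fees"],
    ["/home", "/courses", "/apply"],
    ["/home", "/contact"],
]

if __name__ == "__main__":
    model = fit_transitions(NAV_SESSIONS)
    print(predict_next(model, "/home"))
    print(predict_next(model, "/courses"))
```

A first-order model only looks at the current page; higher-order variants condition on longer histories at the cost of sparser counts.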
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...IOSR Journals
This document discusses using feed forward neural networks and K-means clustering to analyze real-time web traffic. It proposes a technique to enhance the learning capabilities and reduce the computation intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The model uses a multi-layered network architecture with backpropagation learning to discover and analyze knowledge from web log data. It also discusses preprocessing the web log data through cleaning, user identification, filtering, session identification and transaction identification before applying the neural network and K-means algorithms.
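The session identification step mentioned above is commonly implemented with an inactivity timeout. The sketch below assumes cleaned records of the form (user, timestamp, page) and a 30-minute timeout; the timeout value, the record layout and the function name are assumptions for illustration and are not taken from the summarized paper.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, datetime, str]  # (user id, request time, requested page)


def sessionize(records: Iterable[Record],
               timeout: timedelta = timedelta(minutes=30)) -> Dict[str, List[List[str]]]:
    """Split each user's requests into sessions: a gap longer than `timeout`
    between consecutive requests starts a new session."""
    sessions: Dict[str, List[List[str]]] = {}
    last_seen: Dict[str, datetime] = {}
    # Sort by user, then time, so each user's requests are processed in order.
    for user, timestamp, page in sorted(records, key=lambda r: (r[0], r[1])):
        user_sessions = sessions.setdefault(user, [])
        if user not in last_seen or timestamp - last_seen[user] > timeout:
            user_sessions.append([])  # start a new session
        user_sessions[-1].append(page)
        last_seen[user] = timestamp
    return sessions


# Hypothetical cleaned log records.
SAMPLE_RECORDS = [
    ("u1", datetime(2023, 10, 10, 10, 0), "/a"),
    ("u2", datetime(2023, 10, 10, 10, 5), "/x"),
    ("u1", datetime(2023, 10, 10, 10, 10), "/b"),
    ("u1", datetime(2023, 10, 10, 11, 0), "/c"),
]

if __name__ == "__main__":
    print(sessionize(SAMPLE_RECORDS))
```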
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logsijsrd.com
With the exponential growth of the World Wide Web, there is so much information overload that it has become hard to find data according to need. Web usage mining is a part of web mining that deals with the automatic discovery of user navigation patterns from web logs. This paper presents an overview of web mining and also describes navigation pattern discovery using classification and clustering algorithms for web usage mining. Web usage mining consists of three important tasks, namely data preprocessing, pattern discovery, and pattern analysis based on the discovered patterns. The paper also contains a comparative study of web mining techniques.
The World Wide Web is a huge repository of information, and the volume of
information increases tremendously every day. The number of users is also increasing day by day. A lot
of research has taken place to reduce users' browsing time. Web Usage Mining is a type of web mining in which mining techniques are
applied to log data to extract the behaviour of users. Clustering plays an important role in a broad range
of applications like Web analysis, CRM, marketing, medical diagnostics, computational biology, and many
others. Clustering is the grouping of similar instances or objects. The key factor for clustering is some sort
of measure that can determine whether two objects are similar or dissimilar. In this paper a novel
clustering method to partition user sessions into accurate clusters is discussed. The accuracy and various
performance measures of the proposed algorithm show that the proposed method is a better method for
web log mining.
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...IJSRD
This document discusses an enhanced approach for detecting user behavior through country-wise local search. It begins with an abstract describing the development of the web and challenges in the field. It then discusses various techniques for web mining including web usage mining, web content mining, and web structure mining. It also discusses sequential pattern mining algorithms and procedures for recommendation systems. The key contribution is proposing a new local search algorithm for country-wise search to make searching more efficient based on local results.
International Journal of Engineering Research and DevelopmentIJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
An effective search on web log from most popular downloaded contentijdpsjournal
A Web page recommender system effectively predicts the best related web page to search. When a user
searches for a word, a search engine may display unnecessary links and unrelated data; to avoid
this problem, the conceptual prediction model combines both the web usage and domain knowledge. The
proposed conceptual prediction model automatically generates a semantic network of the semantic Web
usage knowledge, which is the integration of domain knowledge and web usage information. Web usage
mining aims to discover interesting and frequent user access patterns from web browsing data. The
discovered knowledge can then be used for many practical web applications such as web
recommendations, adaptive web sites, and personalized web search and surfing.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
In this world of information technology, everyone tends to do business electronically. Today
a lot of business is conducted on the World Wide Web (WWW), so it is very important for the website owner to
provide a better platform to attract more customers to their site. Providing information in a better way is
the solution to bring in more customers or users. The customer is the end user, who accesses the information
in a way that yields some credit to the web site owners. In this paper we define web mining and present a
method to utilize web mining in a better way to understand user and website behaviour, which in turn
enhances the web site information to attract more users. This paper also presents an overview of the
various research done on pattern extraction and web content mining, and how it can be taken as a catalyst
for E-business.
The document describes a proposed fuzzy logic-based model for classifying web users in a personalized search system. The model collects user browsing data using a customized browser. It then fuzzifies the data and generates fuzzy rules using decision trees. These rules are used to label search pages and group users according to their search interests. The model is evaluated against a Bayesian classifier and shown to perform better. The goal is to handle the dynamic and fluctuating nature of user behavior and interests that exist in a personalized web search environment.
A Review: Text Classification on Social Media DataIOSR Journals
This document provides a review of different classifiers used for text classification on social media data. It discusses how social media data is often unstructured and contains users' opinions and sentiments. Various machine learning algorithms can be used to classify this social media text data, extracting meaningful information. The document focuses on describing Naive Bayes classifiers, which are commonly used for text classification tasks. It explains how Naive Bayes classifiers work by calculating the posterior probability that a document belongs to a certain class, based on applying Bayes' theorem with an independence assumption between features.
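The posterior calculation described above can be illustrated with a small multinomial Naive Bayes sketch. The toy training posts, the Laplace (add-one) smoothing, and the function names `train` and `posteriors` are assumptions made for illustration; the reviewed document does not prescribe this exact formulation.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count class and word frequencies from (words, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def posteriors(words, model):
    """Return P(class | words) via Bayes' theorem with independent features."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        # log prior: log P(class)
        score = math.log(class_counts[label] / total_docs)
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                # log likelihood with add-one smoothing: log P(w | class)
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Normalise in a numerically stable way so the posteriors sum to 1.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Hypothetical labelled social media posts.
training = [
    ("great phone love it".split(), "pos"),
    ("love this great camera".split(), "pos"),
    ("bad battery hate it".split(), "neg"),
    ("hate this bad screen".split(), "neg"),
]
model = train(training)
result = posteriors("love great".split(), model)
print(result)
```

Working in log space avoids underflow when a document contains many words, which is why the scores are only exponentiated at the normalisation step.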
The document discusses a review process for analyzing contextual human information behavior factors in web usage mining. It first searches journals and search engines to find empirical studies related to gender differences, prior knowledge and cognitive styles. These studies are then examined to analyze how these three human factors impact web-based interactions. While some commercial analysis applications exist, more work still needs to be done by researchers and developers to build efficient and powerful tools for studying human information behavior.
Personalized web search using browsing history and domain knowledgeRishikesh Pathak
This document proposes a framework for improving personalized web search by constructing an enhanced user profile using both the user's browsing history and domain knowledge. The enhanced user profile is used to better suggest relevant web pages to the user based on their search query. An experiment found that suggestions made using the enhanced user profile performed better than using a standard user profile alone. The framework involves modeling the user, re-ranking search results, and displaying personalized results based on the enhanced user profile.
Classification-based Retrieval Methods to Enhance Information Discovery on th...IJMIT JOURNAL
The widespread adoption of the World-Wide Web (the Web) has created challenges both for society as a whole and for the technology used to build and maintain the Web. The ongoing struggle of information retrieval systems is to wade through this vast pile of data and satisfy users by presenting them with information that most adequately fits their needs. On a societal level, the Web is expanding faster than we can comprehend its implications or develop rules for its use. The ubiquitous use of the Web has raised important social concerns in the areas of privacy, censorship, and access to information. On a technical level, the novelty of the Web and the pace of its growth have created challenges not only in the development of new applications that realize the power of the Web, but also in the technology needed to scale applications to accommodate the resulting large data sets and heavy loads. This thesis presents searching algorithms and hierarchical classification techniques for increasing a search service's understanding of web queries. Existing search services rely solely on a query's occurrence in the document collection to locate relevant documents. They typically do not perform any task or topic-based analysis of queries using other available resources, and do not leverage changes in user query patterns over time. Provided within are a set of techniques and metrics for performing temporal analysis on query logs. Our log analyses are shown to be reasonable and informative, and can be used to detect changing trends and patterns in the query stream, thus providing valuable data to a search service.
The document discusses various techniques for web crawling and focused web crawling. It describes the functions of web crawlers including web content mining, web structure mining, and web usage mining. It also discusses different types of crawlers and compares algorithms for focused crawling such as decision trees, neural networks, and naive bayes. The goal of focused crawling is to improve precision and download only relevant pages through relevancy prediction.
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...ijdmtaiir
In this study a comprehensive evaluation of two
supervised feature selection methods for dimensionality
reduction is performed - Latent Semantic Indexing (LSI) and
Principal Component Analysis (PCA). This is gauged against
unsupervised techniques like fuzzy feature clustering using
hard fuzzy C-means (FCM). The main objective of the study is
to estimate the relative efficiency of two supervised techniques
against unsupervised fuzzy techniques while reducing the
feature space. It is found that clustering using FCM leads to
better accuracy in classifying documents in the face of
evolutionary algorithms like LSI and PCA. Results show that
the clustering of features improves the accuracy of document
classification.
Identifying the Number of Visitors to improve Website Usability from Educatio...Editor IJCATR
Web usage mining deals with understanding the Visitor’s behaviour with a Website. It helps in understanding the concerns
such as present and future probability of every website user, relationship between behaviour and website usability. It has different
branches such as web content mining, web structure and web usage mining. The focus of this paper is on web mining usage patterns of
an educational institution web log data. There are three types of web related log data namely web access log, error log and proxy log
data. In this paper web access log data has been used as dataset because the web access log data is the typical source of navigational
behaviour of the website visitor. The study of web server log analysis is helpful in applying the web mining techniques.
IJRET : International Journal of Research in Engineering and TechnologyImprov...eSAT Publishing House
This document summarizes techniques for improving web search results through web personalization. It discusses how web usage mining can be used to optimize information by monitoring user interaction histories and profiles. The proposed system aims to reduce manual user feedback by implicitly gathering preferences from behaviors like click-through rates and dwell times. It introduces an algorithm that calculates new ranking values for websites based on keyword matches and time spent on pages, and swaps ranks accordingly. This system provides personalized search results by continuously updating rankings based on implicit user interactions.
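The idea of recomputing ranks from keyword matches and time spent on pages might look roughly like the sketch below. The scoring formula (match count plus a weighted dwell time), the `dwell_weight` value, and all names are hypothetical; the summarized paper's actual ranking formula is not given here.

```python
def rerank(results, query_terms, dwell_seconds, dwell_weight=0.01):
    """Re-order search results by keyword matches plus implicit feedback.

    results       : list of dicts with "url" and "title" keys
    query_terms   : list of query keywords
    dwell_seconds : dict mapping url -> seconds the user spent on the page
    dwell_weight  : assumed weight that converts seconds into score units
    """
    terms = [t.lower() for t in query_terms]

    def score(page):
        words = page["title"].lower().split()
        matches = sum(words.count(t) for t in terms)
        return matches + dwell_weight * dwell_seconds.get(page["url"], 0.0)

    # sorted() is stable, so pages with equal scores keep their old order.
    return sorted(results, key=score, reverse=True)


# Hypothetical initial ranking and observed dwell times.
results = [
    {"url": "c", "title": "Cooking recipes"},
    {"url": "b", "title": "Web usage mining survey"},
    {"url": "a", "title": "Web mining tutorial"},
]
dwell = {"a": 120, "c": 100}
ranked = rerank(results, ["web", "usage", "mining"], dwell)
print([page["url"] for page in ranked])
```

Because the dwell times come from observed behaviour, the ranking can be refreshed after each interaction without asking the user for explicit feedback.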
This document discusses interactive visualization techniques for information retrieval. It begins by stating that information retrieval systems often return many results, some more relevant than others. While search engines have grown, problems remain with low precision and recall. Visualization techniques can help users better understand retrieval results. The document then reviews several visualization methods like tree views, title views, and bubble views that can enhance web information retrieval systems by helping users browse, filter, and reformulate queries. It argues visualization is an effective tool for dealing with large numbers of documents returned in web searches.
A novel method for generating an elearning ontologyIJDKP
The Semantic Web provides a common framework that allows data to be shared and reused across
applications, enterprises, and community boundaries. The existing web applications need to express
semantics that can be extracted from users' navigation and content, in order to fulfill users' needs. Elearning
has specific requirements that can be satisfied through the extraction of semantics from learning
management systems (LMS) that use relational databases (RDB) as backend. In this paper, we propose
transformation rules for building owl ontology from the RDB of the open source LMS Moodle. It allows
transforming all possible cases in RDBs into ontological constructs. The proposed rules are enriched by
analyzing stored data to detect disjointness and totalness constraints in hierarchies, and calculating the
participation level of tables in n-ary relations. In addition, our technique is generic; hence it can be applied
to any RDB.
This document summarizes a research paper that proposes a bi-objective recommendation framework (BORF) for venue recommendation on mobile social networks. The BORF uses multiple objective optimization techniques to generate personalized recommendations. It addresses issues like cold start and data sparsity by preprocessing data using a Hub-Average inference model. Both scalar optimization using a Weighted Sum Approach and vector optimization using a NSGA-II evolutionary algorithm are implemented to provide optimal venue recommendations to users. Experimental results on a large real-world dataset confirm the accuracy of the proposed recommendation system.
This document summarizes previous work on content extraction from web pages and proposes a new approach. It discusses existing methods that use techniques like entropy analysis, DOM trees, clustering, and ratios of text, links and tags. The proposed approach combines word to leaf ratio with text link ratio and link text ratio to identify informative nodes in the DOM tree. It calculates weights and relative positions of nodes to select the most informative content. The method will be tested on different website types and compared to existing approaches.
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...idescitation
According to the survey, India is one of the
leading countries in the world for technical education and
management education. The number of students is increasing
day by day at a growth rate of 45% per annum. Advancement
in technology has a special effect on the education system and
helps in upgrading higher education. Some universities and
colleges are using these technologies; the web log is one of them.
The main aim of this paper is to represent web logs using a clustering
technique for predicting the next user movement and for user
behavior analysis. The paper centers on a web log
clustering technique based on Markov chain results. We
present an approach to web clustering
(clustering web site users) and to predicting their behavior on the
next visit. Methodology: To generate effective results, web usage data from approximately
14 engineering colleges is used, and an advanced
clustering approach is presented after optimizing the other
clustering approaches. Results: The user behavior is predicted
with the help of the advanced clustering approach based on
FPCM and k-means. The proposed algorithm is used to mine and
predict users' preferred paths. Existing approaches have been used to predict user behavior,
but they are not sufficient because of their sensitivity to
noise. With the help of ACM, noise is reduced, which gives
more accurate results for predicting user behavior. Approach
implementation: The algorithm was implemented in MATLAB,
DTRG, and Java. The experimental results show that
this method is very effective in predicting user behavior and
validate the method's effectiveness
in comparison with some previous studies.
Enactment of Firefly Algorithm and Fuzzy C-Means Clustering For Consumer Requ...IRJET Journal
The document proposes a novel methodology for predicting consumer demand and future requests on web pages using a hybrid approach. It first classifies consumers as potential or non-potential using a firefly-based neural network with Levenberg-Marquardt algorithm. Potential consumer data is then clustered using an improved fuzzy C-means clustering algorithm. Finally, upcoming consumer demand is predicted by analyzing patterns and recommending web pages with higher weights. The proposed approach is implemented in Java and CloudSim and aims to overcome limitations of existing recommendation systems by providing more accurate and efficient predictions in shorter time.
Certain Issues in Web Page Prediction, Classification and Clustering in Data ...IJAEMSJORNAL
Nowadays, data mining which is a part of web mining plays a vital role in various applications such as search engines, health care centers for extracting the individual patient details among huge database, analyzing disease based on basic criteria, education system for analyzing their performance level with other system, social networking, E-Commerce and knowledge management etc., which extract the information based on the user query. The issues are time taken to mine the target content or webpage from the search engines, space complexity and predicting the frequent webpage for the next user based on users’ behaviour.
User Navigation Pattern Prediction from Web Log Data: A SurveyIJMER
This paper proposes a survey of Web Page Prediction Techniques. Prefetching of Web page has been widely used to reduce the access latency problem of the Web users. However, if Prefetching of Web page is not accurate and Prefetched web pages are not visited by the users in their accesses, the limited bandwidth of network and services of server will not be used efficiently and may face the problem of access delay. Therefore, it is critical that we need an effective prediction method during prefetching.
The Markov models have been widely used to predict and analyze users navigational behavior. All the
activities of web users have been saved in web log files. The stored users session is used to extract
popular web navigation paths and predict current users next web page visit.
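A first-order Markov model of the kind mentioned above can be sketched in a few lines. The sample sessions and the function names `fit_transitions` and `predict_next` are illustrative assumptions, and the surveyed techniques may use higher-order or hybrid models instead.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count page-to-page transitions observed in stored user sessions."""
    counts = defaultdict(Counter)
    for pages in sessions:
        for current, following in zip(pages, pages[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, current_page):
    """Return (most likely next page, estimated probability), or None."""
    followers = counts.get(current_page)
    if not followers:
        return None  # page never seen with a successor in the log
    page, n = followers.most_common(1)[0]
    return page, n / sum(followers.values())


# Hypothetical sessions extracted from a web log.
sessions = [
    ["/home", "/courses", "/apply"],
    ["/home", "/courses", "/fees"],
    ["/home", "/news"],
    ["/courses", "/apply"],
]
counts = fit_transitions(sessions)
print(predict_next(counts, "/home"))
print(predict_next(counts, "/apply"))
```

A prefetcher could use the returned probability as a threshold, fetching the predicted page only when the estimate is high enough to justify the bandwidth.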
Classification of User & Pattern discovery in WUM: A SurveyIRJET Journal
This document summarizes research on web usage mining techniques. It discusses how web usage mining involves discovering patterns from web server logs to understand how users interact with websites. The document reviews several papers on preprocessing log data, pattern discovery methods like clustering and classification, and classifying users based on patterns. It also provides an overview of the web usage mining process, which typically involves preprocessing, pattern discovery from cleaned logs, and using patterns to classify users. The goal is to help website administrators better understand users and personalize websites.
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...James Heller
This document summarizes an extensive literature review on web usage mining. It begins by defining web mining and its three types: web content mining, web structure mining, and web usage mining. It then describes the main stages of web usage mining in detail: pre-processing, storage models, pattern discovery, optimization techniques, and pattern analysis. Finally, it discusses the characteristics of web server log data, including that it is unstructured, heterogeneous, distributed, contains different data types and dynamic content, and is voluminous, non-scalable, and incremental in nature.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Recommendation generation by integrating sequential pattern mining and semanticseSAT Journals
Abstract As the Internet usage keeps increasing, the number of web sites and hence the number of web pages also keeps increasing. A recommendation system can be used to provide personalized web service by suggesting the pages that are likely to be accessed in future. Most of the recommendation systems are based on association rule mining or based on keywords. Using the association rule mining the prediction rate is less as it doesn’t take into account the order of access of the web pages by the users. The recommendation systems that are key-word based provides lesser relevant results. This paper proposes a recommendation system that uses the advantages of sequential pattern mining and semantics over the association rule mining and keyword based systems respectively. Keywords: Sequential Pattern Mining, Taxonomy, Apriori-All, CS-Mine, Semantic, Clustering
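The point that association rules ignore the order of page accesses, while sequential patterns keep it, can be seen in a small counting sketch. This is not Apriori-All or CS-Mine; it is a simplified illustration with hypothetical sessions and function names, counting only pairs of pages.

```python
from collections import Counter
from itertools import combinations


def unordered_pair_support(sessions):
    """Association-style support: pages co-occur, order is ignored."""
    counts = Counter()
    for pages in sessions:
        for pair in combinations(sorted(set(pages)), 2):
            counts[pair] += 1
    return counts


def ordered_pair_support(sessions):
    """Sequential-style support: (x, y) counts only if x precedes y."""
    counts = Counter()
    for pages in sessions:
        seen = set()
        for i, first in enumerate(pages):
            for second in pages[i + 1:]:
                if first != second:
                    seen.add((first, second))
        counts.update(seen)  # each session contributes at most once per pair
    return counts


# Hypothetical sessions over three pages.
sessions = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
unordered = unordered_pair_support(sessions)
ordered = ordered_pair_support(sessions)
print(unordered[("A", "B")], ordered[("A", "B")], ordered[("B", "A")])
```

In this toy data the unordered count cannot tell whether users go from A to B or from B to A, whereas the ordered counts separate the two directions, which is the information a next-page recommender needs.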
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...ijdkp
Web sequential patterns are important for analyzing and understanding users’ behaviour to improve the
quality of service offered by the World Wide Web. Web Prefetching is one such technique that utilizes
prefetching rules derived through Cyclic Model Analysis of the mined Web sequential patterns. Prediction
becomes more accurate and the results of prefetching more satisfying when a highly efficient and
scalable mining technique such as the Bidirectional Growth based Directed Acyclic Graph is used. In this paper,
we propose a novel algorithm called Bidirectional Growth based mining Cyclic behavior Analysis of web
sequential Patterns (BGCAP) that effectively combines these strategies to generate prefetching rules in the
form of 2-sequence patterns with Periodicity and threshold of Cyclic Behaviour that can be utilized to
effectively prefetch Web pages, thus reducing the users’ perceived latency. As BGCAP is based on
Bidirectional pattern growth, it performs only (log n+1) levels of recursion for mining n Web sequential
patterns. Our experimental results show that prefetching rules generated using BGCAP is 5-10% faster for
different data sizes and 10-15% faster for a fixed data size than TD-Mine. In addition, BGCAP generates
about 5-15% more prefetching rules than TD-Mine.
This document provides a literature review on methods for predicting user future requests using web usage mining. It discusses several past studies that have used techniques like Markov models, clustering, association rules, and sequential pattern mining to build prediction models from web server log data. The studies aim to reduce user waiting times and server loads by pre-fetching frequently accessed web pages. The document reviews the advantages and disadvantages of different prediction techniques and algorithms discussed in previous research.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...ijbuiiir1
Web Usage Mining is a kind of web mining which provides knowledge about user navigation behavior and extracts interesting patterns from the web. Web usage mining refers to the automatic discovery and analysis of patterns in clickstream and associated data collected as a consequence of user interactions with web resources on one or more web sites. It identifies the needs and interests of users and is useful for upgrading web resources; web site developers can update their web sites according to users' interests. This paper discusses the different methodologies used in previous research work for discovering user behavior and predicting future requests.
COST-SENSITIVE TOPICAL DATA ACQUISITION FROM THE WEBIJDKP
The cost of acquiring training data instances for induction of data mining models is one of the main concerns in real-world problems. The web is a comprehensive source for many types of data which can be used for data mining tasks. But the distributed and dynamic nature of web dictates the use of solutions which can handle these characteristics. In this paper, we introduce an automatic method for topical data acquisition from the web. We propose a new type of topical crawlers that use a hybrid link context extraction method for topical crawling to acquire on-topic web pages with minimum bandwidth usage and with the lowest cost. The new link context extraction method which is called Block Text Window (BTW), combines a text window method with a block-based method and overcomes challenges of each of these methods using the advantages of the other one. Experimental results show the predominance of BTW in comparison with state of the art automatic topical web data acquisition methods based on standard metrics.
A Study of Pattern Analysis Techniques of Web Usageijbuiiir1
Web mining is the most important application of data mining techniques to extract knowledge from web data including web document, hyperlinks between documents, usage logs of web sites etc. Web mining has been explored to a vast degree and different techniques have been proposed for a huge variety of applications that includes search engine enhancement, optimization of web services, Business Intelligence, B2B and B2C business etc. Most research on web mining has been from a "process-centric" point of view which defined web mining as a sequence of tasks. In this paper, we highlight the significance of studying the evolving nature of the web pattern analysis (WPA). Web usage mining is used to discover interesting user navigation patterns and can be applied to many real-world problems, such as improving web sites/pages. A Web usage mining system performs five major tasks: i) data collection ii) information filtering iii) pattern discovery iv) pattern analysis and visualization techniques, and v) Knowledge Query Mechanism (KQM). Each task is explained in detail and its related technologies are introduced. The web mining research is a converging research area from several research communities, such as database system, information retrieval, information extraction and artificial intelligence. In this paper we implement how web usage mining techniques can be applied for the customization i.e. web visualization.
IRJET- Text-based Domain and Image Categorization of Google Search Engine usi...IRJET Journal
This document discusses a proposed system for categorizing search engine results using conceptual clustering. The system analyzes the content of search results to extract relevant concepts, then uses a personalized conceptual clustering algorithm to generate a decision tree of query clusters. This tree can be used to identify categories for web pages and provide topically relevant results to users. The system aims to improve on traditional ranked search results by categorizing results based on the conceptual preferences and interests of individual users.
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...ijcsa
The high school students must be observed for their slow learning or quick learning abilities to provide
them with the best education practices. Such analysis can be perfectly performed over the student
performance data. The high school student data has been obtained from the schools from the various
regions in Punjab, a pivotal state of India. The complete student data and the selective data of almost 1300
students obtained from one school in the regions have been tested using the proposed model in
this paper. The proposed model is based upon the naïve bayes classification model for the data
classification using the multi-factor features obtained from the input dataset. The subject groups have been
divided into the two primary groups: difficult and normal. The classification algorithm has been applied
individually over data grouped in the various subject groups. Both of the early stage classification events
have produced the almost similar results, whereas the results obtained from the classification events over
the averaging factors and the floating factors told the different story than the early stage classification. The
proposed model results have shown that the deep analysis of the data tells the in-depth facts from the input
data. The proposed model can be considered as the effectiv
Similar to A Review on Pattern Discovery Techniques of Web Usage Mining (20)
A Review on Pattern Discovery Techniques of Web Usage Mining
1. M Rekha Sundari et al. Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 9( Version 4), September 2014, pp.131-136
M. Rekha Sundari, Dept. of CSE, GITAM University
Y. Srinivas, Dept. of IT, GITAM University
PVGD. Prasad Reddy, Dept. of CS&SE, Andhra University
Abstract--- In recent years, with the development of Internet technology, the growth of the World Wide Web has exceeded all expectations. A great deal of information is available in different formats, and retrieving interesting content has become a very difficult task. One possible approach to this problem is Web Usage Mining (WUM), an important application of Web mining. Extracting the hidden knowledge in the log files of a Web server, recognizing the various interests of Web users, and discovering customer behavior on a site are commonly cited applications of Web usage mining. In this paper we provide an updated, focused survey of Web usage mining techniques.
Keywords: Web Usage Mining, Pattern Discovery, Clustering, Classification.
I. INTRODUCTION
Advances in technology have brought revolutionary strides in carrying out e-business through the World Wide Web (WWW). The explosive increase in the usage of the WWW and its capacity to store huge amounts of data have attracted millions of visitors. As data continue to grow in size and complexity, sophisticated methods to organize the layout of the information become important. The information extracted from the data is used for the efficient and effective management of activities related to e-business, e-education, e-commerce, personalization, Web site design, improvement and management, network traffic analysis, search engine complexity, and the prediction of users' actions [40]. Understanding the needs of their users is vital for Web site owners who want to serve them better. This has generated a need to extract useful information from the huge amount of data related to Web sites. This data is of many types: the content of Web documents, such as text and graphics; data from the Web structure, such as HTML or XML tags; data from Web logs, such as IP addresses and the date or time of access of Web pages; and user-specific data, such as registration details and customer profiles. This user-specific data is recorded in the Web access log files of Web servers and is usually referred to as Web Usage Data (WUD). WUM is the area of Web mining that applies data mining techniques to reveal interesting knowledge from the WUD.
1.1 Web Usage Mining (WUM)
WUM applies data mining techniques to the WUD in order to reveal interesting knowledge. It is a three-phase process [15] comprising data collection and preprocessing, pattern discovery, and pattern analysis of Web data.
A. Data Preprocessing
The success of the pattern analysis phase depends heavily on how well the data preparation task is executed, so every detail of this task must be handled carefully. The task covers loading the data, checking its accuracy, combining data from disparate sources, transforming the data into the required format, and finally structuring it to meet the input requirements of the chosen data mining algorithm. It involves many phases: data cleaning, feature extraction, feature reduction, user identification, session identification, page identification, formatting, and finally data summarization [8].
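To make the cleaning, user identification, and session identification phases concrete, the following is a minimal sketch and not the procedure of [8]. It assumes already-parsed log records, treats each IP address as one user, and splits sessions with a 30-minute inactivity timeout; the record layout, the ignored file suffixes, and the timeout value are all illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed log records: (ip, timestamp, url).
LOG = [
    ("10.0.0.1", "2014-09-01 10:00:00", "/index.html"),
    ("10.0.0.1", "2014-09-01 10:00:01", "/logo.png"),
    ("10.0.0.1", "2014-09-01 10:05:00", "/products.html"),
    ("10.0.0.2", "2014-09-01 10:06:00", "/index.html"),
    ("10.0.0.1", "2014-09-01 11:30:00", "/index.html"),
]

# Assumed heuristics, not taken from the paper.
IGNORED_SUFFIXES = (".png", ".gif", ".jpg", ".css", ".js")
SESSION_TIMEOUT = timedelta(minutes=30)


def clean(records):
    """Data cleaning: drop requests for embedded resources such as images."""
    return [r for r in records if not r[2].lower().endswith(IGNORED_SUFFIXES)]


def sessionize(records, timeout=SESSION_TIMEOUT):
    """User identification by IP, then session identification by timeout."""
    by_user = defaultdict(list)
    for ip, stamp, url in records:
        by_user[ip].append((datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), url))

    sessions = []
    for ip, hits in by_user.items():
        hits.sort()  # chronological order within one user
        current = [hits[0][1]]
        for (previous_time, _), (time, url) in zip(hits, hits[1:]):
            if time - previous_time > timeout:
                sessions.append((ip, current))  # gap too long: close the session
                current = []
            current.append(url)
        sessions.append((ip, current))
    return sessions


SESSIONS = sessionize(clean(LOG))
```

Identifying users by IP address alone is a simplification; proxies and shared machines would need additional cues such as the user agent or cookies.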
B. Pattern Discovery
Knowledge extraction algorithms drawn from AI, data mining, psychology, and information theory are applied to the preprocessed data. Most systems developed for WUM have introduced algorithms for finding maximal forward references and large reference sequences in order to analyze the traversal path of a user. Mining techniques such as path analysis, association rules, sequential patterns, clustering, and classification are used in WUM and are discussed in the subsequent sections. The choice of technique depends entirely on the requirements of the analyst. Applied to the preprocessed Web access logs, these algorithms transform the log data into knowledge by uncovering the patterns hidden in it.
C. Pattern Analysis
The last phase of the WUM process is the analysis of the obtained results, in order to distinguish trivial, useless knowledge from knowledge that could be used for Web site modification, system improvement, and/or Web personalization. The common techniques used for pattern analysis are visualization, OLAP [5], data and knowledge querying, and usability analysis.
II. PATTERN DISCOVERY TECHNIQUES OF WUM
The following techniques have been identified in the pattern discovery phase of Web usage mining.
2.1 Clustering
Clustering aims at dividing the data set into groups (clusters) such that inter-cluster similarity is minimized while the similarity within each cluster is maximized [44]. In the context of WUM, two kinds of clusters can be distinguished: user clusters and page clusters. Web page clustering groups pages with similar content, while user clustering groups users by the similarity of their navigational behavior. Clustering can be model-based or distance-based. In model-based clustering [49], the model type is often specified a priori; the model structure can be determined by model selection techniques and the parameters estimated using maximum likelihood algorithms, e.g., Expectation Maximization (EM). Distance-based clustering involves determining a distance measure between pairs of data objects and then grouping similar objects together into clusters. The most popular distance-based clustering techniques are partitional clustering and hierarchical clustering.
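As an illustration of distance-based partitional user clustering, the sketch below runs a plain k-means loop over binary page-visit vectors. It is not one of the surveyed algorithms; the page list, the four user vectors, the squared Euclidean distance, and the choice of initial centroids are all assumptions made for the example.

```python
# Hypothetical users described by which pages they visited (1) or not (0).
# Column order: home, news, sports, shop, cart.
USERS = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def k_means(points, centroids, iterations=10):
    """Alternate assignment and centroid update for a fixed number of rounds."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            updated.append(mean(members) if members else old)  # keep empty clusters
        centroids = updated
    return assign(points, centroids), centroids


# Seed with the first and third user (an arbitrary, assumed initialization).
LABELS, CENTROIDS = k_means(USERS, [USERS[0], USERS[2]])
```

With this toy data the first two users end up in one cluster and the last two in the other. The result depends on the distance measure and the seeds, which is the data dependence that the discussion of partitional methods refers to.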
Yang and Balaji [49] proposed a hierarchical pattern-based clustering algorithm that groups Web transactions by maximizing an objective function in order to achieve a good clustering of customer transactions. Sophia et al. [43] emphasized the need to discover similarities in users' accessing behavior with respect to the time and locality of their navigational acts. The two tracks of the proposed algorithms define clusters of users that show similar visiting behavior in the same time period, varying the priority given to the page or to the time of the visit. Raju and Sudhamani [13] proposed a novel partitional approach for dynamically grouping Web users based on their Web access patterns, using the Adaptive Resonance Theory 1 Neural Network (ART1 NN) clustering algorithm. Cheng et al. [7] proposed a method that uses both agglomerative and partitional clustering. Loyola et al. [32] proposed a novel methodology for analyzing Web user behavior based on session simulation with an Ant Colony Optimization algorithm. Ríos et al. [42] used two common clustering algorithms, Self-Organizing Feature Maps (SOM) and K-medoids, to obtain the behavior patterns of users.
Model-based clustering has been shown to be effective for high-dimensional text clustering [53], whereas hierarchical distance-based clustering has proved unsuitable for the vast amount of Web data. Partitional distance-based clustering is hampered by the variety of distance measures proposed for clustering: defining a good measure is highly data dependent and often requires expert domain knowledge. Despite the variety of clustering approaches that have been used for Web usage mining, clustering is employed to guide the predictive system and is not, on its own, an appropriate approach for Web page prediction [20]. It is merely used to segment the data into homogeneous groups so that a quality model can be built on each group. Another limitation of clustering methods is the difficulty of evaluating and comparing their performance, because there is no objective evaluation criterion that is independent of the specific application.
2.2 Association Rule Mining
As shown by Mobasher et al. [27], association rule mining (AR mining) is a major pattern discovery technique. The association rule, or frequent itemset, mining algorithm was originally proposed by Agrawal and Srikant [1] for market basket analysis. Because of its wide applicability, many revised algorithms have been introduced, and AR mining remains an active research area. Association rule discovery on usage data finds groups of pages that are commonly accessed together. The applications of association rules extend far beyond market basket analysis; they have been used in various domains, including Web mining.
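The notions of support and confidence behind association rules over usage data can be sketched as follows. This is a brute-force count over page pairs and not the Apriori algorithm of [1]; the sessions and both thresholds are hypothetical values chosen for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions, each reduced to the set of pages it contains.
SESSIONS = [
    {"/home", "/products", "/cart"},
    {"/home", "/products"},
    {"/home", "/blog"},
    {"/products", "/cart"},
]


def pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) between single pages."""
    total = len(sessions)
    page_count = Counter(page for s in sessions for page in s)
    pair_count = Counter(
        pair for s in sessions for pair in combinations(sorted(s), 2)
    )

    rules = []
    for (a, b), count in pair_count.items():
        support = count / total  # share of sessions containing both pages
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / page_count[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


RULES = pair_rules(SESSIONS)
```

In this toy data every session that reaches `/cart` also contains `/products`, so the rule `/cart` to `/products` has confidence 1.0 at support 0.5. Sets discard the order of visits, which is one reason ordered techniques such as sequential pattern mining are treated separately.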
Mobasher et al. [3] proposed an effective technique for capturing common user profiles based on association rule discovery and usage-based clustering. They proposed techniques for combining these user profiles with the current status of an ongoing Web activity to perform real-time personalization, taking into account both the offline tasks and the online process of automatic Web page customization. Przemysław Kazienko [36] presented a new approach that mines indirect association rules and relates them to the direct association rules; the two are joined into one set of complex association rules, which is then used for recommending Web pages. Yong et al. [50] gave algorithms for mining sequential association rules based on different combinations of sequence and temporal constraints. The performance of these algorithms was compared on a real Web log dataset using analysis of variance. They showed that the sequence constraints, the temporal constraints, and the interaction between the two can affect the precision of prediction, and concluded that temporal constraints have a greater effect than sequence constraints. B. Santhosh Kumar and K. V. Rukmani [38] discovered the Web usage patterns of Web sites from server log files using the Apriori algorithm and the Frequent Pattern Growth algorithm.
The main problem associated with association rule mining is the frequent item problem: items that occur together with high frequency also appear together in many of the resulting rules, which leads to inconsistent predictions. As a consequence, a system cannot give recommendations when the data set is large. In addition, AR algorithms that use multiple support thresholds achieve better coverage but do not improve accuracy [24]. AR algorithms that store the most frequent itemsets in a data structure and use an algorithm to recognize the most suitable items suffer from scalability problems and low coverage [33]. AR algorithms applied to large transactions lead to redundant and complex rules [27].
2.3 Sequential Pattern Mining
Sequential patterns in Web usage data capture the Web page trails that are often visited by users, in the order in which they were visited. These are sequences of items that occur in a sufficiently large proportion of (sequence) transactions. Viewing Web transactions as sequences of pageviews has paved the way for a number of useful and well-studied models for discovering user navigation patterns. One such approach is to model the navigational activities in the Web site as a Markov Model (MM): each pageview is represented as a state, and the transition probability between two states represents the likelihood that a user will navigate from one state to the other. This representation allows a number of useful user or site metrics to be computed. Lower-order Markov models lack accuracy because they cannot cover enough browsing history. Higher-order Markov models generally provide higher prediction accuracy but result in much higher model complexity due to the larger number of states. Pitkow et al. [35] proposed all-kth-order Markov models to improve coverage and a new state reduction technique, called longest repeating subsequences, to reduce model size. The use of all-kth-order Markov models generally requires the generation of a separate model for each of the k orders. If the model cannot make a prediction using the kth order, it attempts to make a prediction by incrementally decreasing the model order. This scheme can easily lead to even higher space complexity, since it requires the representation of all possible states for each k. Deshpande et al. [10] proposed selective Markov models, with three different techniques to overcome the space complexity of existing all-kth-order Markov models. The proposed schemes involve pruning the model based on criteria such as support, confidence, and error rate.
The confidence-pruned MM generates all the states irrespective of their frequencies, whereas the support-pruned MM eliminates all states whose support falls below a minimum frequency threshold. Anderson et al. [2] proposed Relational Markov Models, a generalization of Markov models in which states can be of different types, with each type described by a different set of variables. This model tends to perform better than existing models when plenty of data about all states is available. Galassi et al. [16] and Wang et al. [47] proposed hidden Markov models. A Hidden Markov Model starts with a finite set of states. Transitions among the states are governed by a set of probabilities (transition probabilities) associated with each state. In a particular state, an outcome or observation can be generated according to a separate probability distribution associated with that state. Only the outcome, not the state, is visible to an external observer. The states are "hidden" from the outside, hence the name Hidden Markov Model. Christopher et al. [29] proposed VOGUE (Variable Order and Gapped HMM for Unstructured Elements), which relies on a variable-gap sequence mining method to extract frequent patterns with different lengths and gaps between elements. These patterns are then used to build a variable-order hidden Markov model that explicitly models the gaps.
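The fallback behaviour of an all-kth-order Markov model, in which the highest order with a matching context is tried first and the order is then decreased, can be sketched as below. This is an illustrative toy and not the implementation of [35] or [10]; the sessions, the page names, and the maximum order of 2 are assumptions, and no pruning is applied.

```python
from collections import Counter, defaultdict

# Hypothetical sessions as ordered page sequences.
TRAILS = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "B", "C"],
    ["E", "B", "C"],
]


def train(sessions, max_order=2):
    """Build counts[k][context][next_page] for every order k up to max_order."""
    counts = {k: defaultdict(Counter) for k in range(1, max_order + 1)}
    for session in sessions:
        for k in range(1, max_order + 1):
            for i in range(k, len(session)):
                context = tuple(session[i - k:i])  # the k pages before position i
                counts[k][context][session[i]] += 1
    return counts


def predict(counts, history):
    """Return (page, probability, order used), or None if no order matches."""
    for k in sorted(counts, reverse=True):  # try the highest order first
        if len(history) < k:
            continue
        context = tuple(history[-k:])
        if context in counts[k]:
            followers = counts[k][context]
            page, hits = followers.most_common(1)[0]
            return page, hits / sum(followers.values()), k
    return None


MODEL = train(TRAILS)
```

For the history `["A", "B"]` the second-order context is known, so the prediction uses order 2. For `["X", "B"]` the second-order context was never observed and the model falls back to order 1. Keeping one table per order is what drives the space complexity discussed above.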
2.4 Classification
Classification is the task of mapping a data item into one of several predefined classes. The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting model is then used to assign class labels to the testing instances. A large number of methods, differing in the nature of the underlying model, have been developed, and the choice of method always depends on the task at hand. Under this heading we describe decision trees, a logical (symbolic) technique; the naive Bayesian classifier, a statistical technique; the k-nearest neighbor classifier, an instance-based learning technique; and Support Vector Machines, a special classification technique.
2.4.1 Decision Trees
Murthy [30] provided an overview of work on decision trees and a sample of their usefulness for newcomers to the field of data mining. Elomaa [14] presented a comparative study of well-known pruning methods and concluded that there is no single best pruning method. Bruha [4] pointed out that not only postprocessing but also preprocessing algorithms exist for decision tree construction. Zidrina Pabarskaite [55] proposed decision trees for Web user behaviour analysis; this analysis predicts users' future actions and the typical pages that lead to browsing termination. Using the decision tree package C4.5, Olcay and Onur [31] showed how to parallelize the C4.5 algorithm in three ways: (i) feature based, (ii) node based, and (iii) data based. To sum up, one of the most useful characteristics of decision trees is their comprehensibility. Decision trees tend to perform better when dealing with discrete/categorical features.
2.4.2 Naïve Bayes Classifier
Bayesian networks are the best-known representatives of statistical learning algorithms. The major advantage of the naive Bayes classifier is its short training time. In addition, since the model has the form of a product, it can be converted into a sum through the use of logarithms, with significant computational advantages. Domingos and Pazzani [12] performed a large-scale comparison of the naive Bayes classifier with state-of-the-art algorithms for decision tree induction, instance-based learning, and rule induction on standard benchmark datasets, and found it to be sometimes superior to the other learning schemes, even on datasets with substantial feature dependencies. Deng et al. [22] proposed spy Naïve Bayes to identify user preference pairs generated from click-through data. Santra and Jayasudha [39] used the naive Bayesian classification algorithm to classify interested users. They measured the performance of this algorithm on Web log data with session-based timing, page visits, repeated user profiling, and page depth to the site length, and concluded that it classifies Web log files using less memory and time than the existing C4.5 algorithm.
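The conversion of the naive Bayes product into a sum of logarithms can be illustrated with a small categorical classifier. The sketch below is not the method of [39]; the two features (visit length and whether the cart was viewed), the class names, and the use of add-one (Laplace) smoothing are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training sessions: ((visit_length, viewed_cart), label).
SAMPLES = [
    (("long", "yes"), "interested"),
    (("long", "no"), "interested"),
    (("short", "yes"), "interested"),
    (("short", "no"), "casual"),
    (("short", "no"), "casual"),
    (("long", "no"), "casual"),
]


def train_nb(samples):
    """Count class frequencies and per-class feature value frequencies."""
    class_count = Counter(label for _, label in samples)
    value_count = defaultdict(Counter)  # (label, position) -> value counts
    domain = defaultdict(set)           # position -> values seen in training
    for features, label in samples:
        for position, value in enumerate(features):
            value_count[(label, position)][value] += 1
            domain[position].add(value)
    return class_count, value_count, domain


def log_scores(model, features):
    """Log prior plus the sum of log likelihoods for each class."""
    class_count, value_count, domain = model
    total = sum(class_count.values())
    scores = {}
    for label, n in class_count.items():
        score = math.log(n / total)
        for position, value in enumerate(features):
            # Add-one smoothing keeps unseen values from producing log(0).
            numerator = value_count[(label, position)][value] + 1
            score += math.log(numerator / (n + len(domain[position])))
        scores[label] = score
    return scores


def classify(model, features):
    """Pick the class with the highest log score."""
    scores = log_scores(model, features)
    return max(scores, key=scores.get)


NB_MODEL = train_nb(SAMPLES)
```

Working with sums of logarithms avoids the numerical underflow that a product of many small probabilities would cause, and the ranking of the classes is unchanged because the logarithm is monotonic.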
2.4.3 k-Nearest Neighbour
k-Nearest Neighbour (kNN) is based on the principle that the instances within a dataset generally exist in close proximity to other instances that have similar properties. Because kNN makes no assumptions about the underlying data distribution and does not use the training data points to perform any generalization, it is called a non-parametric lazy learning algorithm. Guo et al. [17] proposed a novel kNN-type method for classification that aims to overcome the dependency on the selection of a "good value" for k. Yu and Liu [51] addressed the problem of determining, via feature selection, which of the available input features should be used in modeling, because this can improve the classification accuracy and reduce the required classification time.
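A minimal kNN classifier makes the "lazy" aspect visible: no model is built, and all work happens at query time. The training points, the two features (pages per session and minutes on site), the class names, and the choice of Euclidean distance with k = 3 are assumptions made for this sketch.

```python
import math
from collections import Counter

# Hypothetical labelled sessions: ((pages_per_session, minutes_on_site), label).
TRAINING = [
    ((2, 1), "bouncer"),
    ((3, 2), "bouncer"),
    ((1, 1), "bouncer"),
    ((12, 15), "buyer"),
    ((10, 20), "buyer"),
    ((14, 18), "buyer"),
]


def knn_classify(training, query, k=3):
    """Majority vote among the k training points closest to the query."""
    neighbours = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

For example, `knn_classify(TRAINING, (11, 16))` votes among the three nearest sessions, all of which are labelled `buyer` in this toy data. Both the value of k and the scaling of the features affect the outcome, which is the dependency that [17] and [51] address.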
2.4.4 Support Vector Machines
Support Vector Machines (SVMs) are a relatively recent supervised machine learning technique [46]. Burges [6] gave an excellent survey of SVMs, and a more recent book on SVMs is by Cristianini and Shawe-Taylor [9]. Yuh-Jye Lee and O. L. Mangasarian [52] presented a support vector machine for pattern classification using a completely arbitrary kernel. Sung-Hae Jun [45] used SVMs to analyze Web log data and estimated the dependency between Web pages, overcoming the difficulty of sparsity. Satoshi Mizuno [41] proposed a method that creates a user's profile from the browsing history using Term Frequency-Inverse Document Frequency and then classifies the URLs of the browsing history using an SVM.
2.5 Mixture Models
Mixture models play an important role in classification. To identify the proper model for classifying the data, we assume that the behavior of each user in the data set is generated independently by a mixture model with K components. In a mixture model, we are concerned with (1) the number of components, (2) the probability distribution used to assign users to the various clusters, and (3) the parameters of each model component. Once the model is estimated, we can use it to assign each user to a cluster or fractionally to the set of clusters. Yanzan Kevin Zhou and Bamshad Mobasher [48] proposed an approach for Web user segmentation and online behavior analysis based on a mixture of factor analyzers. In this framework, they modelled users' shared interests as a set of common latent factors extracted through factor analysis, and discovered user segments based on the posterior component distribution of a finite mixture model. This measured the relationships between users' unobserved conceptual interests and their observed navigational behavior in a principled probabilistic manner. Attenberg et al. [19] developed a generative model that mimics trends in observed user activity using a mixture of Pareto distributions. Mihajlo Grbovic et al. [26] proposed a time- and memory-efficient algorithm for learning label preferences based on the Gaussian Mixture Model (GMM); this model proved attractive because of its intuitively clear learning process and ease of implementation. Rekha et al. [37] proposed an Adaptive Gaussian Mixture Model for user behavior modeling. The developed method has shown a drastic improvement over GMM in identifying the navigational patterns of users.
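The three ingredients listed above (number of components, assignment probabilities, component parameters) and the fractional assignment of a user to clusters can be illustrated with a one-dimensional Gaussian mixture fitted by EM. This sketch is not the model of [26], [37], or [48]; the session durations, K = 2, the starting parameters, and the variance floor are assumptions.

```python
import math

# Hypothetical session durations in minutes for eight users.
DURATIONS = [1.0, 2.0, 1.5, 2.5, 20.0, 22.0, 19.0, 21.0]


def gaussian_pdf(x, mean, variance):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(
        2 * math.pi * variance
    )


def responsibilities(x, weights, means, variances):
    """Posterior probability of each component given x (fractional assignment)."""
    joint = [
        w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    ]
    total = sum(joint)
    return [j / total for j in joint]


def fit_em(data, weights, means, variances, iterations=30):
    """Plain EM for a 1-D Gaussian mixture; inputs are copied, not mutated."""
    weights, means, variances = list(weights), list(means), list(variances)
    for _ in range(iterations):
        # E-step: fractional assignment of every point to every component.
        resp = [responsibilities(x, weights, means, variances) for x in data]
        # M-step: re-estimate each component from its weighted points.
        for k in range(len(weights)):
            mass = sum(r[k] for r in resp)
            weights[k] = mass / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / mass
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data))
            variances[k] = max(spread / mass, 1e-6)  # floor avoids a zero variance
    return weights, means, variances


# Assumed starting point: equal weights, rough means, wide variances.
WEIGHTS, MEANS, VARIANCES = fit_em(DURATIONS, [0.5, 0.5], [0.0, 10.0], [25.0, 25.0])
```

After fitting, `responsibilities(x, WEIGHTS, MEANS, VARIANCES)` returns the share of a user with duration `x` that belongs to each component. A hard cluster label is obtained by taking the largest share.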
2.6 Collaborative Filtering (CF)
Collaborative filtering (CF), a technique used primarily to predict individuals' preferences, has its origin in information filtering. The technique guides an active user based on the preferences shared by like-minded users. Once a database of the preferences of such users has been accumulated, a similarity measure is used to identify individuals whose past preferences are similar to those of the active user. A preference function is then applied to the database to guide the active user or make recommendations [18]. The technique is easy to comprehend and implement, but requires a large sample to make meaningful recommendations. Erroneous recommendations result when close neighbors do not exist. Content information and customer profile or behavior information are not used for making recommendations. As the database grows, computing the recommendations becomes more intensive. CF systems also suffer from a fundamental problem called the sparsity problem: since the set of all available items in a system is very large, most users have rated very few items, and hence it is difficult to find a neighborhood of highly similar users for the active user. As a result, the accuracy of the recommendations may be poor [34]. To overcome these disadvantages, classification and prediction techniques have been applied to collaborative filtering in the Web domain. Lin et al. [23] proposed a collaborative recommendation system using association rules. Zhonghang Xia et al. [54] proposed a collaborative filtering system with SVM. Koji Miyahara and Michael J. Pazzani [21] proposed a collaborative filtering system with a Bayesian classifier. Miha Grcar et al. [25] presented experimental results comparing the k-Nearest Neighbor (kNN) algorithm with SVM in the collaborative filtering framework, using datasets with different properties. Dhruv Gupta et al. [11] presented a new linear-time collaborative filtering algorithm, based on principal component analysis and clustering, for efficient and effective personalized information retrieval.
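The similarity measure and preference function described above can be sketched as user-based collaborative filtering. This is an illustrative toy and not the system of [18]; the users, pages, rating scale, the cosine similarity over co-rated pages, and the similarity-weighted average used as the preference function are all assumptions.

```python
import math

# Hypothetical preference database: user -> {page: rating from 1 to 5}.
RATINGS = {
    "alice": {"/news": 5, "/sports": 4, "/shop": 1},
    "bob": {"/news": 4, "/sports": 5, "/shop": 1, "/travel": 5},
    "carol": {"/news": 1, "/sports": 1, "/shop": 5, "/travel": 2},
}


def cosine(u, v):
    """Cosine similarity restricted to the pages both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict_rating(ratings, user, item):
    """Similarity-weighted average of the neighbours' ratings for one item."""
    weighted_sum = weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        similarity = cosine(ratings[user], their_ratings)
        weighted_sum += similarity * their_ratings[item]
        weight_total += abs(similarity)
    # No neighbour has rated the item: no recommendation can be made.
    return weighted_sum / weight_total if weight_total else None
```

Here `predict_rating(RATINGS, "alice", "/travel")` is pulled towards bob's high rating because bob's past ratings resemble alice's more than carol's do. When no neighbour has rated an item the function returns `None`, a small-scale picture of the sparsity problem.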
III. CHALLENGES
The data stored in the Web log is of crucial importance for analyzing users' behavioral patterns. This data is generally unstructured, and efficient methodologies need to be developed to analyze it. The log data studied in the literature often contains inconsistent, incorrect, and missing values. Advanced methodologies that can process the data more efficiently, minimizing the inconsistent data, in order to retrieve the Web pages that match users' interests are therefore a current concern. Hence efficient clustering and classification algorithms, together with effective preprocessing techniques, are to be developed.
IV. CONCLUSION
This paper gives an insight into the data mining techniques that can be applied to Web usage data to achieve the synergetic effect of Web usage mining. Association rules are used to discover pages that are often visited together. Sequential patterns discovered from Web access logs can be used to predict the future visits of users. Clustering discovers groups of users or pages based on their similarities. Classification assigns a new user to the most likely of the predefined groups. It is hard, if not impossible, to declare one data mining algorithm the best in general, because the outcome of the WUM process always depends on the problem at hand.
References
[1] Agarwal R. and Srikant R., "Fast algorithms for mining association rules", VLDB’94, Chile, pp. 487–499, 1994.
[2] Anderson C., Domingos P., Weld D. S., "Relational Markov Models and their Application to Adaptive Web Navigation", Proceedings of the 8th ACM SIGKDD Conference, Canada, August 2002.
[3] Bamshad Mobasher, Robert Cooley, Jaideep Srivastava, "Creating Adaptive Web Sites Through Usage-Based Clustering of URLs", Proceedings of the 1999 Workshop on Knowledge and Data Engineering, pp. 19, 1999.
[4] Bruha I., "From machine learning to knowledge discovery: Survey of preprocessing and postprocessing", Intelligent Data Analysis, Vol. 4, pp. 363-374, 2000.
[5] Buchner A. and Mulvenna M. D., "Discovering Internet Marketing Intelligence through Online Analytical Web Usage Mining", Proceedings of the ACM SIGMOD Intl. Conf. on Management of Data (SIGMOD’99), pp. 54–61, 1999.
[6] Burges C., "A tutorial on support vector machines for pattern recognition", Data Mining and Knowledge Discovery, Vol. 2, pp. 1-47, 1998.
[7] Cheng D., Kannan R., Vempala S. and Wang G., "A divide-and-merge methodology for clustering", ACM SIGMOD, pp. 196–212, 2005.
[8] Cooley R., Mobasher B., and Srivastava J., "Data Preparation for Mining World Wide Web Browsing Patterns", Knowledge and Information Systems, vol. 1(1), pp. 5–32, 1999.
[9] Cristianini N. and Shawe-Taylor J., "An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods", Cambridge University Press, Cambridge, 2000.
[10] Deshpande M., Karypis G., "Selective Markov Models for Predicting Web-Page Accesses", Proceedings of the 1st SIAM International Conference on Data Mining, 2004.
[11] Dhruv Gupta, Mark Digiovanni, Hiro Narita, and Ken Goldberg, "Jester 2.0: Evaluation of a New Linear Time Collaborative Filtering Algorithm", SIGIR’99, Berkeley, CA, USA, ACM, 1999.
[12] Domingos P. and Pazzani M., "On the optimality of the simple Bayesian classifier under zero-one loss", Machine Learning, pp. 103-130, 1997.
[13] Dr. G T Raju and Dr. M V Sudhamani, "A Novel Approach for Extraction of Cluster Patterns from Web Usage Data and its Performance Analysis", IEEE, 2011.
[14] Elomaa T., "The biases of decision tree pruning strategies", Lecture Notes in Computer Science 1642. Springer, pp. 63-74, 1999.
[15] Fayyad U. M., Piatetsky-Shapiro G., and Smyth P., "From Data Mining to Knowledge Discovery: An Overview", Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press, pp. 1–34, 1996.
[16] Galassi U., Botta M., and Giordana A., "Hierarchical hidden markov models for user process profile learning", Fundamenta Informaticae 78, vol. 4, pp. 487–505, 2007.
[17] Guo G., Wang H., Bell D., Bi Y., and Greer K. "KNN Model-Based Approach in Classification", Lecture Notes in Computer Science, Vol 2888, Pages 986 – 996, 2003.
[18] http://www.fico.com/en/Communities/AnalyticTechnologies/Pages/CollaborativeFiltering.aspx.
[19] Josh Attenberg, Sandeep Pandey, Torsten Suel, "Modeling and Predicting User Behavior in Sponsored Search", KDD’09, June 28–July 1, ACM 978-1-60558-495-9/09/06, 2009.
[20] Kim D., Adam N., Atluri V., Bieber M. and Yesha Y., "A click stream based collaborative filtering personalization model: Towards a better performance", WIDM’04, pp. 88–95, 2004.
[21] Koji Miyahara and Michael J. Pazzani, "Collaborative Filtering with the Simple Bayesian Classifier", Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence, pp. 679-689, Springer-Verlag, Berlin, Heidelberg, 2000.
[22] Lin et al., "Spying Out Real User Preferences for Metasearch Engine Personalization", Proceedings of the WEBKDD’04 in conjunction with KDD’04, August 22, 2004.
[23] Lin W., Alvarez S. A., and Ruiz C., "Efficient Adaptive-Support Association Rule Mining for Recommender Systems", Data Mining and Knowledge Discovery, vol. 6, pp. 83–105, 2002.
[24] Liu B., Hsu W. and Ma Y., "Mining association rules with multiple minimum support", KDD, San Diego, pp. 337–341, 1999.
[25] Miha Grcar and Dunja Mladenic, "kNN Versus SVM in the Collaborative Filtering Framework", WebKDD’05, August 21, Chicago, Illinois, ACM 1-59593-214-3, 2005.
[26] Mihajlo Grbovic and Nemanja Djuric and Slobodan Vucetic, "Learning from Pairwise Preference Data using Gaussian Mixture Model", Preference Learning Workshop, European Conference on Artificial Intelligence, 2012.
[27] Mobasher B., Dai H., Luo T. and Nakagawa M., "Discovery of aggregate usage profiles for web personalization", WebKDD’00, USA, pp. 61–82, 2000.
[28] Mobasher B., Dai H., Luo T. and Nakagawa M., "Effective personalization based on association rule discovery from web usage data", WIDM’01, USA, 2001.
[29] Mohammed J. Zaki, Christopher D. Carothers and Boleslaw K. Szymanski, "VOGUE: A Variable Order Hidden Markov Model with Duration based on Frequent Sequence Mining", ACM Transactions on Knowledge Discovery from Data, vol. 4(1), article 5, January 2010.
[30] Murthy, "Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey", Data Mining and Knowledge Discovery, pp. 345–389, 1998.
[31] Olcay Taner Yıldız and Onur Dikmen, "Parallel univariate decision trees", Pattern Recognition Letters, Vol. 28, Issue 7, pp. 825-832, May 2007.
[32] Pablo Loyola, Pablo E. Román and Juan D. Velásquez, "Clustering-Based Learning Approach for Ant Colony Optimization Model to Simulate Web User Behavior", IEEE, 2011.
[33] Park J. S., Philip S. Y. and Chen M. S., "Mining association rules with adjustable accuracy", CIKM’97, pp. 151–160, 1997.
[34] Paul et al., "GroupLens: an open architecture for collaborative filtering of netnews", CSCW '94 Proceedings of the 1994 ACM conference on Computer supported cooperative work, pp. 175-186, 1994.
[35] Pitkow J. and Pirolli P., "Mining Longest Repeating Subsequences to Predict WWW Surfing", Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems, 1999.
[36] Przemyslaw Kazienko, "Mining Indirect Association Rules For Web Recommendation", Int. J. Appl. Math. Comput. Sci., Vol. 19, No. 1, pp. 165–186, 2009.
[37] Rekha Sundari M., Prasad Reddy PVGD and Srinivas Y., "User Behavior Modeling based on Adaptive Gaussian Mixture Model", International Journal of Computer Applications 60(4):1-3, December 2012.
[38] Santhosh Kumar B. and Rukmani K.V., "Implementation of Web Usage Mining Using APRIORI and FP Growth Algorithms", International Journal of Advanced Networking and Applications, Vol. 01, Issue 06, pp. 400-404, 2010.
[39] Santra A. K. and Jayasudha S., "Classification of Web Log Data to Identify Interested Users Using Naïve Bayesian Classification", IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 1, No 2, January 2012.
[40] Sasa Bosnjak, Mirjana Maric, and Zita Bosnjak, "The Role of Web Usage Mining in Web applications Evaluation", Management Information Systems, Vol. 5, No. 1, pp. 031-036, 2010.
[41] Satoshi Mizuno, "Personalized Web Search System with Categorization using SVM", University of Aizu, Graduation Thesis, March 2009.
[42] Sebastian A. Rios, Roberto A. Silva, and Felipe Aguilera, "A dissimilarity measure for automate moderation in online social networks", WI&C, 3, ACM, 2012.
[43] Sophia G. Petridou, Vassiliki A. Koutsonikola, Athena I. Vakali, and Georgios Papadimitriou, "Time Aware Web Users Clustering", 1041-4347, IEEE, 2007.
[44] Srivastava J., Cooley R., Deshpande M. and Tan P., "Web usage mining: Discovery and applications of usage patterns from web data", SIGKDD Explorations, pp. 12–23, 2000.
[45] Sung-Hae Jun, "Web usage mining using support vector machine", IWANN’05 Proceedings of the 8th International Conference on Artificial Neural Networks: Computational Intelligence and Bioinspired Systems, pp. 349-356, 2005.
[46] Vapnik V., "The Nature of Statistical Learning Theory", Springer Verlag, 1995.
[47] Wang et al., "Mining complex time-series data by learning markovian models", 6th IEEE International Conference on Data Mining, 2006.
[48] Yanzan Kevin Zhou and Bamshad Mobasher, "Web User Segmentation Based on a Mixture of Factor Analyzers", EC-Web 2006, LNCS 4082, pp. 11–20, 2006.
[49] Yinghui Yang and Balaji Padmanabhan, "A Hierarchical Pattern-Based Clustering Algorithm For Grouping Web Transactions", IEEE Transactions On Knowledge And Data Engineering, Vol. 17, No. 9, September 2005.
[50] Yong Wang, Zhanhuai Li and Yang Zhang, "Mining Sequential Association-Rule For Improving Web Document Prediction", ICCIMA’05, pp. 146–151, 2005.
[51] Yu L. and Liu H., "Efficient Feature Selection via Analysis of Relevance and Redundancy", JMLR, 5(Oct):1205-1224, 2004.
[52] Yuh-Jye Lee and O.L. Mangasarian, "SSVM: A Smooth Support Vector Machine for Classification", Computational Optimization and Applications, pp. 5–22, 2001.
[53] Zhong S. and Ghosh J., "A unified framework for model-based clustering", Machine Learning Research 4, 1001–1037, 2003.
[54] Zhonghang Xia, Yulin Dong and Guangming Xing, "Support Vector Machines For Collaborative Filtering", ACM SE’06, Melbourne, Florida, USA, pp. 10-12, March 2006.
[55] Zidrina Pabarskaite, "Decision trees for web log mining", Journal Intelligent Data Analysis, Vol. 7, Issue 2, pp. 141-154, April 2003.