This paper presents a survey of web page prediction techniques. Web page prefetching has been widely used to reduce the access latency experienced by Web users. However, if the prefetching is inaccurate and prefetched pages are never visited by users, the limited network bandwidth and server capacity are used inefficiently and access delays may even increase. An effective prediction method is therefore critical to prefetching.
Markov models have been widely used to predict and analyze users' navigational behavior. The activities of web users are recorded in web log files, and the stored user sessions are used to extract popular navigation paths and predict the current user's next page visit.
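To make the idea summarized above concrete, here is a minimal sketch of a first-order Markov next-page predictor built from already-extracted user sessions. The session data and page paths are hypothetical placeholders, not taken from any of the surveyed papers.

```python
from collections import defaultdict, Counter

def build_markov_model(sessions):
    """Count page-to-page transitions observed in user sessions."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            transitions[current_page][next_page] += 1
    return transitions

def predict_next_page(transitions, current_page):
    """Return the most frequently observed successor of current_page, if any."""
    followers = transitions.get(current_page)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical sessions reconstructed from a web server log.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products", "/product/42"],
    ["/home", "/about"],
]
model = build_markov_model(sessions)
print(predict_next_page(model, "/home"))  # -> "/products"
```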
This document provides a literature review on methods for predicting user future requests using web usage mining. It discusses several past studies that have used techniques like Markov models, clustering, association rules, and sequential pattern mining to build prediction models from web server log data. The studies aim to reduce user waiting times and server loads by pre-fetching frequently accessed web pages. The document reviews the advantages and disadvantages of different prediction techniques and algorithms discussed in previous research.
Integrating Vague Association Mining with Markov Model (ijsc)
The increasing demand on the World Wide Web raises the need to predict a user's next web page request. The most widely used approach to predicting web pages is the pattern discovery process of Web usage mining, which draws on techniques such as Markov models, association rules and clustering. Fuzzy theory has been combined with these techniques to improve results. Our focus is on Markov models. This paper introduces vague rules combined with Markov models, based on vague set theory, to achieve higher accuracy.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Farthest First Clustering in Links Reorganization (IJwest)
A website can be designed easily, but making user navigation efficient is not an easy task, since user behavior keeps changing and the developer's view often differs from what users want. One way to improve navigation is to reorganize the website structure. The strategy proposed here uses the farthest-first traversal clustering algorithm to cluster on two numeric parameters, and the Apriori algorithm to find users' frequent traversal paths. The aim is to perform the reorganization with as few changes to the website structure as possible.
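The abstract above names farthest-first traversal clustering on two numeric parameters. The sketch below shows only the generic farthest-first center-selection step on hypothetical two-dimensional points; the parameter names and data are assumptions, not taken from the cited paper.

```python
import math

def farthest_first_centers(points, k):
    """Pick k cluster centers: start from the first point, then repeatedly
    choose the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        next_center = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centers),
        )
        centers.append(next_center)
    return centers

def assign_clusters(points, centers):
    """Assign each point to the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points]

# Hypothetical (visit frequency, time-on-page) pairs for a handful of pages.
points = [(2, 30), (3, 28), (40, 5), (42, 7), (20, 60)]
centers = farthest_first_centers(points, k=2)
print(assign_clusters(points, centers))
```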
The document discusses various techniques for web crawling and focused web crawling. It describes the functions of web crawlers including web content mining, web structure mining, and web usage mining. It also discusses different types of crawlers and compares algorithms for focused crawling such as decision trees, neural networks, and naive bayes. The goal of focused crawling is to improve precision and download only relevant pages through relevancy prediction.
A Survey on: Utilizing of Different Features in Web Behavior Prediction (Editor IJMTER)
As the number of web users increases day by day, many websites receive a large number of visitors at the same instant, and handling these users requires different techniques. One emerging field addressing this is next-page prediction, in which various features derived from user navigation patterns are studied to predict the next page a user will visit, thereby reducing overall web server response time. This paper presents a detailed study of different research papers, their techniques and outcomes, and the features they use, such as web structure, web logs and web content.
An Efficient Hybrid Successive Markov Model for Predicting Web User Usage Beh... (Waqas Tariq)
With the continued growth and proliferation of Web services and Web-based information systems, the volume of user data has reached astronomical proportions. Analyzing such data with Web usage mining can help determine the visiting interests or needs of web users. Because web logs are incremental in nature, predicting exactly how users browse websites is a crucial issue, and web miners need predictive mining techniques that filter out unwanted categories to reduce the operational scope. The first-order Markov model has low prediction accuracy, which is why extensions to higher-order models are necessary. All-higher-order Markov models promise higher prediction accuracy and better coverage than any single-order model, but suffer from high state-space complexity. Hence a hybrid Markov model is required to improve operational performance and prediction accuracy significantly. This paper introduces an Efficient Hybrid Successive Markov Prediction model (HSMP). The HSMP model first predicts the likely categories of interest using a relevance factor, which is used to infer users' browsing behavior across web categories. It then predicts pages within the predicted categories using techniques that intelligently combine Markov models of different orders, so that the resulting model has low state complexity and improved prediction accuracy while retaining the coverage of the all-higher-order Markov model. These techniques eliminate low-support states, evaluate the probability distribution and estimate the error associated with each state, without affecting the overall accuracy of the resulting model. Several experiments were conducted to validate the proposed prediction model, and their results support the claims made in the paper.
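The HSMP abstract refers to combining different-order Markov models and eliminating low-support states. The HSMP algorithm itself is not reproduced in this survey, so the sketch below only illustrates the generic idea of frequency-pruned higher-order models with fallback to lower orders; the support threshold and data are assumptions.

```python
from collections import defaultdict, Counter

def build_kth_order_model(sessions, k, min_support=2):
    """Map each length-k context to a counter of next pages,
    then drop contexts observed fewer than min_support times."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - k):
            context = tuple(session[i:i + k])
            model[context][session[i + k]] += 1
    return {ctx: nxt for ctx, nxt in model.items()
            if sum(nxt.values()) >= min_support}

def predict(models, history):
    """Try the highest-order model whose context matches the recent history,
    falling back to lower orders (a simple all-Kth-order combination)."""
    for k in sorted(models, reverse=True):
        context = tuple(history[-k:])
        if context in models[k]:
            return models[k][context].most_common(1)[0][0]
    return None

sessions = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c"], ["b", "c", "d"]]
models = {k: build_kth_order_model(sessions, k, min_support=2) for k in (1, 2)}
print(predict(models, ["a", "b"]))  # uses the 2nd-order model, falls back if needed
```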
A NEW IMPROVED WEIGHTED ASSOCIATION RULE MINING WITH DYNAMIC PROGRAMMING APPR... (cscpconf)
With the rapid development of the Internet, Web search has taken on an important role in everyday life, and mining frequent patterns in large databases is a major research area within it. As user activity on the web increases, web-search methods that predict the next request in a user's page visits play a major role: they help provide quality results and timely answers and offer customized navigation. In web search, association rule mining is an important data-analysis method for discovering associated web pages. Most researchers have implemented association mining with the Apriori algorithm using a binary representation; the problem with this approach is that it does not address issues such as the navigation order of web pages. To overcome this, a weighted Apriori was proposed to maintain navigation order, but it fails to produce optimal results. With the goal of a more favorable result, we propose a novel approach that combines weighted Apriori with dynamic programming. Experimental results show that the approach maintains the navigation order of web pages and achieves a better solution. The proposed technique enhances website effectiveness, increases knowledge of user browsing, improves prediction accuracy and reduces computational complexity.
Recommendation generation by integrating sequential pattern mining and semantics (eSAT Journals)
Abstract: As Internet usage keeps increasing, the number of web sites, and hence of web pages, also keeps increasing. A recommendation system can provide personalized web service by suggesting the pages that are likely to be accessed in the future. Most recommendation systems are based on association rule mining or on keywords. With association rule mining the prediction rate is lower because it does not take into account the order in which users access web pages, and keyword-based recommendation systems return less relevant results. This paper proposes a recommendation system that exploits the advantages of sequential pattern mining and semantics over association-rule-based and keyword-based systems respectively. Keywords: Sequential Pattern Mining, Taxonomy, Apriori-All, CS-Mine, Semantic, Clustering
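To make the contrast with unordered association rules concrete, the sketch below counts ordered page pairs across sessions and keeps only those above a minimum support. It is a toy stand-in for AprioriAll/CS-Mine-style sequential pattern mining; the data and threshold are assumptions.

```python
from collections import Counter
from itertools import combinations

def frequent_ordered_pairs(sessions, min_support=2):
    """Count page pairs (a, b) where a is visited before b within a session,
    preserving access order (unlike plain association rules)."""
    counts = Counter()
    for session in sessions:
        seen = set()  # count each ordered pair at most once per session
        for a, b in combinations(session, 2):
            if (a, b) not in seen:
                counts[(a, b)] += 1
                seen.add((a, b))
    return {pair: c for pair, c in counts.items() if c >= min_support}

sessions = [
    ["/home", "/catalog", "/item/7"],
    ["/home", "/catalog", "/cart"],
    ["/catalog", "/home"],
]
print(frequent_ordered_pairs(sessions))  # only ("/home", "/catalog") survives
```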
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
DESIGN AND IMPLEMENTATION OF CARPOOL DATA ACQUISITION PROGRAM BASED ON WEB CR... (ijmech)
Public transport now makes life more and more convenient, and the number of vehicles in large and medium-sized cities is also growing rapidly. To take full advantage of social resources and protect the environment, regional end-to-end public transport services can be established by analyzing online travel data. Computer programs for processing web pages are necessary to access the large volume of carpool data. In this paper, web crawlers are designed to capture travel data from several large service sites; a breadth-first algorithm is used to maximize access to the traffic data, and the carpool data are saved in structured form. The paper thus provides a convenient data-collection method for the program.
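The abstract states that a breadth-first strategy is used to maximize coverage. Below is a minimal, hedged sketch of a breadth-first crawler frontier using only the Python standard library; the seed URL is a placeholder, and a real crawler would add politeness delays, robots.txt handling and structured storage of the scraped records.

```python
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href attributes from anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl_bfs(seed_url, max_pages=20):
    """Visit pages level by level starting from seed_url (breadth-first)."""
    queue, visited = deque([seed_url]), set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue
        visited.add(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            queue.append(urljoin(url, link))
    return visited

# Hypothetical seed site; swap in the real travel-service start pages.
print(crawl_bfs("http://example.com/", max_pages=5))
```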
A Vague Improved Markov Model Approach for Web Page Prediction (IJCSES Journal)
Today most information in all areas is available on the web. This increases web utilization and attracts the interest of researchers in improving the effectiveness of web access. As the number of web clients grows, bandwidth is shared and web access efficiency decreases. Web page prefetching improves the effectiveness of web access by making the next required page available before the user demands it; it is an intelligent predictive-mining process that analyzes a user's web access history and predicts the next page. In this work, a vague improved Markov model is presented to perform the prediction, with vague rules used to prune the Markov model at different levels. Once the prediction table is generated, association mining is applied to identify the most likely next page. The integrated model is proposed to improve prediction accuracy and effectiveness.
Implementation of Enhancing Information Retrieval Using Integration of Invisib... (iosrjce)
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
3 iaetsd semantic web page recommender system (Iaetsd)
This document discusses a semantic web-page recommender system that aims to improve upon traditional recommender systems. It proposes two novel knowledge representation models: 1) a semantic network that represents domain knowledge through terms, webpages, and relationships automatically constructed from website data, and 2) a conceptual prediction model that integrates semantic knowledge with web usage data to create a weighted semantic network for making recommendations. The system seeks to overcome limitations of prior systems by automating knowledge base construction and addressing the "new-item problem" through incorporation of semantic information. Evaluation shows the proposed approach yields better performance than existing web usage-based recommender systems.
IJCER (www.ijceronline.com) International Journal of computational Engineerin... (ijceronline)
This document describes a novel parallel domain focused crawler (PDFC) that aims to reduce the load on networks by crawling only web pages relevant to specific domains and skipping pages that have not been modified. The proposed system uses mobile crawlers that stay at remote sites to monitor page changes and only send compressed, modified pages back to the search engine. It is estimated that this approach can reduce network load by up to 40% by avoiding downloading of unchanged pages. The key components of PDFC include modules for URL allocation, comparing page changes over time, and estimating frequency of changes to prioritize recrawling. Experimental results suggest the system is effective at preserving network resources.
The Application Programming Interface restricts the types of queries that a Web service can answer. For instance, a Web service might provide a method that quickly returns the books of a given author, but not a method that returns the authors of a given book. If the user asks for the author of a specific book, the Web service cannot be called, even though the underlying database may contain the desired piece of information; this scenario is called asymmetry. The asymmetry is particularly problematic if the service is used in a Web service orchestration system. This survey proposes the use of on-the-fly information extraction (IE): IE is used to collect values, which can then serve as parameter bindings for the Web service. The survey shows how information extraction can be integrated into a Web service orchestration system. The proposed approach is fully implemented in a prototype called Search Using Services and Information Extraction (SUSIE), and real-life data and services are used to demonstrate the practical viability and good performance of the approach.
A Review on Pattern Discovery Techniques of Web Usage Mining (IJERA Editor)
In recent years, with the development of Internet technology, the growth of the World Wide Web has exceeded all expectations. A great deal of information is available in different formats, and retrieving interesting content has become a very difficult task. One possible approach to this problem is Web Usage Mining (WUM), an important application of Web mining. Extracting the hidden knowledge in web server log files, recognizing the various interests of web users, and discovering customer behavior on a site are typical applications of web usage mining. This paper provides an updated, focused survey of web usage mining techniques.
Identifying the Number of Visitors to improve Website Usability from Educatio... (Editor IJCATR)
Web usage mining deals with understanding a visitor's behaviour on a website. It helps in understanding concerns such as the present and future behaviour of every website user and the relationship between behaviour and website usability. Web mining has different branches: web content mining, web structure mining and web usage mining. The focus of this paper is on mining the usage patterns in an educational institution's web log data. There are three types of web-related log data, namely web access logs, error logs and proxy logs; this paper uses web access log data as the dataset, because the access log is the typical source of the navigational behaviour of website visitors. The study of web server log analysis is helpful in applying web mining techniques.
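Since the study works from web access log data, the following sketch shows a common preprocessing step: grouping already-parsed log records into per-visitor sessions with a 30-minute inactivity timeout. The record format and field positions are assumptions; real logs (for example the Apache combined format) need a proper parser first.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(records):
    """Group (ip, timestamp, url) records into per-visitor sessions,
    starting a new session after 30 minutes of inactivity."""
    records = sorted(records, key=lambda r: (r[0], r[1]))
    sessions, current = [], []
    for ip, ts, url in records:
        if current and (ip != current[-1][0] or ts - current[-1][1] > SESSION_TIMEOUT):
            sessions.append([u for _, _, u in current])
            current = []
        current.append((ip, ts, url))
    if current:
        sessions.append([u for _, _, u in current])
    return sessions

# Hypothetical, already-parsed access-log records.
records = [
    ("10.0.0.1", datetime(2015, 3, 1, 9, 0), "/index"),
    ("10.0.0.1", datetime(2015, 3, 1, 9, 5), "/courses"),
    ("10.0.0.1", datetime(2015, 3, 1, 11, 0), "/index"),
    ("10.0.0.2", datetime(2015, 3, 1, 9, 2), "/admissions"),
]
print(sessionize(records))
```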
International Journal of Engineering Research and Development (IJERD Editor)
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
Web Page Recommendation Using Web Mining (IJERA Editor)
On the World Wide Web, various kinds of content are generated in huge amounts, so web recommendation has become an important part of web applications for giving users relevant results. Different kinds of recommendations are made available to users every day, including images, video, audio, query suggestions and web pages. In this paper we aim to provide a framework for web page recommendation: 1) we first describe the basics of web mining and its types; 2) we detail each web mining technique; 3) we propose an architecture for personalized web page recommendation.
This document proposes a new technique to enhance the learning capabilities and reduce the computation intensity of a competitive learning multi-layered neural network using the K-means clustering algorithm. The proposed model uses a multi-layered network architecture with backpropagation learning to analyze web log data. Data preprocessing steps like cleaning, user identification, and transaction identification are applied to prepare the enterprise proxy log data for analysis. The proposed framework aims to discover useful patterns from web log data through a combination of K-means clustering and a feedforward neural network.
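The summary above combines K-means clustering with a neural network. The neural-network stage is not reproduced here; the sketch below only shows the K-means step on hypothetical per-user page-visit count vectors using scikit-learn, and the feature construction is an assumption rather than the cited paper's exact preprocessing.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical feature vectors: per-user visit counts for three page groups
# (e.g. product pages, help pages, blog pages) derived from a cleaned proxy log.
X = np.array([
    [12, 1, 0],
    [10, 2, 1],
    [0, 8, 7],
    [1, 9, 6],
    [11, 0, 2],
])

# Cluster users into two behavioural groups.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(labels)                   # cluster id per user
print(kmeans.cluster_centers_)  # centroid of each behavioural group
```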
WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ... (cscpconf)
Web access log analysis studies the patterns of web site usage and the features of user behavior. Raw log data are very noisy and unclear, so it is vital to preprocess them for an efficient web usage mining process. Preprocessing comprises three phases: data cleaning, user identification and session construction. Session construction is particularly important; many real-world problems can be modeled as traversals on a graph, and mining these traversals provides what the preprocessing phase requires. Existing work, however, has only considered traversals on unweighted graphs. This paper generalizes this to the case where the vertices of the graph are given weights to reflect their significance. The proposed method constructs each session as a Partial Ancestral Graph (PAG) containing pages with calculated weights, which helps site administrators find the pages that interest users and redesign their web pages accordingly. After each page is weighted according to browsing time, a PAG structure is constructed for each user session. Existing systems have difficulty learning with the latent variables in the data; the proposed method overcomes this problem.
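The PAG construction itself is specific to the cited paper and is not reproduced here; the sketch below only illustrates the page-weighting step it describes, estimating browsing time for each page from the gap between consecutive requests in a session. The timestamps and the default weight for the final page are assumptions.

```python
from datetime import datetime

def browsing_time_weights(session, default_seconds=30):
    """Weight each page by the time until the next request in the session.
    The last page has no successor, so it receives a default weight."""
    weights = {}
    for (url, ts), (_, next_ts) in zip(session, session[1:]):
        weights[url] = weights.get(url, 0) + (next_ts - ts).total_seconds()
    last_url = session[-1][0]
    weights[last_url] = weights.get(last_url, 0) + default_seconds
    return weights

# Hypothetical single session of (url, timestamp) pairs.
session = [
    ("/home", datetime(2015, 3, 1, 9, 0, 0)),
    ("/paper", datetime(2015, 3, 1, 9, 0, 40)),
    ("/download", datetime(2015, 3, 1, 9, 3, 0)),
]
print(browsing_time_weights(session))
```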
This document summarizes previous work on content extraction from web pages and proposes a new approach. It discusses existing methods that use techniques like entropy analysis, DOM trees, clustering, and ratios of text, links and tags. The proposed approach combines word to leaf ratio with text link ratio and link text ratio to identify informative nodes in the DOM tree. It calculates weights and relative positions of nodes to select the most informative content. The method will be tested on different website types and compared to existing approaches.
The document discusses optimizing search performance in unstructured peer-to-peer networks by improving the overlay topology. Existing approaches organize peers based on similarity but do not provide performance guarantees. The proposed approach presents a novel overlay construction algorithm that exploits the power-law file sharing pattern exhibited in P2P networks. It provides rigorous guarantees including searching in O(logN) hops with high probability and progressively exploiting peer similarity. Simulations show it outperforms competing algorithms in reducing query hops, improving success ratio, and minimizing messaging overhead.
Repairing of Concrete by Using Polymer-Mortar Composites (IJMER)
This document discusses research into using polymer-mortar composites for concrete repair. Two sets of mixtures were prepared with different cement-sand ratios and percentages of epoxy polymer added. Tests were conducted to determine the composites' compressive, flexural, and bonding strengths. The highest compressive strength was 102.889MPa for a 1:1 cement-sand ratio with 40% polymer. Bonding strength was highest for a 40:60 polymer-mortar ratio. The results indicate that proportional mixing of cement, sand and polymer play a key role in the composite's strength properties for use in concrete repairs.
A digital footprint is a record of everything posted online about an individual, including what they post themselves, photos others post of them, and information posted by friends and family. This footprint can impact things like job applications, school admission, and what people think of an individual, both positively and negatively. To manage their footprint, people should be careful about what they post online, the websites they join, forms they fill out, who they give information to, and photos they post.
This document discusses a proposed system for allowing IT professionals to remotely access their office computers from mobile phones using cellular technology. It describes how the system would work, including using SHA-1 encryption to securely transmit data between the mobile device and PC. The system is intended to make remote desktop sharing more convenient and accessible for traveling IT professionals.
This document discusses innovative techniques for teaching English in the digital age. It covers several techniques including blogs, social networking, wikis, massively multiplayer online games, and mobile phone assisted language learning. The key techniques allow for collaboration outside the classroom, interaction and communication in English, project-based learning, and accessibility of language lessons through mobile devices. Overall, the document advocates for English teachers to adopt these new digital techniques to engage students and meet the demands of a changing era.
Area Efficient and High-Speed FIR Filter Implementation Using Divided LUT Method (IJMER)
The traditional method of implementing FIR filters costs considerable hardware resources, which works against reducing circuit scale and increasing system speed. A new design and implementation of FIR filters using Distributed Arithmetic is provided in this paper to solve this problem. A Distributed Arithmetic structure is used to improve resource usage, while a pipeline structure is used to increase system speed. In addition, the divided LUT method is used to decrease the number of required memory units. The simulation results indicate that FIR filters using Distributed Arithmetic work stably at high speed, can save almost 50 percent of hardware resources to decrease circuit scale, and can be applied to a variety of areas thanks to their great flexibility and high reliability.
This document discusses generalized closed sets in topological spaces. It begins by introducing several types of generalized closed sets that have been defined in previous literature, such as g-closed sets, sg-closed sets, gs-closed sets, etc. It then defines a new type of generalized closed set called a g*s-closed set, which is a subset A such that the semi-closure of A is contained in every g-open set containing A. Examples are provided to illustrate g*s-closed sets. Properties of g*s-closed sets are discussed, such as every semi-closed set being g*s-closed, but the converse is not true. The relationship between g*s-closed sets and other
Removal of Clutter by Using Wavelet Transform for Wind Profiler (IJMER)
ABSTRACT: Removal of clutter in a radar wind profiler is of the utmost importance, and the wavelet transform is a very effective method for removing it. This paper presents a technique based on the wavelet transform to remove clutter. In this technique we apply the Fourier transform and the discrete wavelet transform, and then apply the inverse discrete wavelet transform to the signal. The technique is applied to the in-phase and quadrature components, the total spectrum and a single range gate. Very encouraging results were obtained, showing practical possibilities for a real-time implementation and for applications in the frequency domain.
Keywords: Wind profiler, wavelet transform, Fourier transform, clutter, signal processing
This document summarizes a research paper about using a Z-Source Inverter (ZSI) based STATCOM to enhance power quality in a thirty bus power system. It first provides background on power quality issues and how Flexible AC Transmission Systems (FACTS) devices like STATCOMs can help address them. It then describes the components and operation of a conventional STATCOM and introduces the ZSI as an alternative inverter topology. The research presented in the paper models and simulates a thirty bus system both with and without a ZSI-based STATCOM to study improvements in voltage regulation and reactive power compensation.
Modified Procedure for Construction and Selection of Sampling Plans for Vari... (IJMER)
The linear trend is a technique for generating values for an observed frequency distribution, and the accuracy of the smoothing it provides depends on the number of available data sets. In this article, an attempt is made to estimate a modified linear trend value generator for the construction and selection of sampling plans for a variables inspection scheme indexed through the MAAOQ over the linear trend. We compare the constructed sampling plans indexed through the MAAOQ over the linear trend with the basic sampling plans indexed by the AOQL, and also examine the performance of the operating characteristic curves.
The document discusses programmed instruction, which is a systematic, self-paced method of instruction designed to ensure learning. It breaks content into small steps with built-in feedback. There are different types, including linear, branched, and mathetics programming. Programmed instruction aims to place the learner at the center and allow them to construct knowledge through active participation, as opposed to passive absorption of information. While it shows promise, programmed instruction has seen limited application in Indian classrooms.
This document discusses various techniques for sentiment analysis of application reviews, including both statistical and natural language processing approaches. It describes how sentiment analysis can be used to analyze textual reviews and classify them as positive or negative. Several key techniques are discussed, such as using machine learning classifiers like Naive Bayes, extracting n-grams and sentiment-oriented words, and developing rule-based models using techniques like identifying parts of speech. The document also discusses using these techniques to perform sentiment analysis at both the document and aspect levels.
The document discusses the key elements of the everlasting gospel and the three phases of the first angel's message from Revelation 14:6-7. It explains that the everlasting gospel highlights salvation through Jesus alone, and must be brought back to its pure form. It also summarizes that the first angel's message includes: 1) a worldwide proclamation of the pure gospel, 2) a warning that the judgment hour has begun, and 3) a call to worship the Creator of the heavens, earth, and sea - pointing to the seventh-day Sabbath. The pure gospel and this three-phase message are described as restoring true Christianity and preparing people for Christ's return.
A Survey of User Authentication Schemes for Mobile Device (IJMER)
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
Report Card Night is an event where teachers meet with parents for 3 minute "mini conferences" over a 90 minute period in the gym to discuss student progress and grades. Teachers are seated alphabetically at tables with students' grades and laptops. Parents pick up report cards, choose which teachers to meet with, and visit them directly for feedback. This format provides more meaningful parent-teacher interaction than traditional open houses. Benefits include immediate feedback, no lost report cards, and parents can target specific teachers. Potential issues are ensuring enough space and technology for conferences within the 90 minute time period.
This document describes the design of an air conditioning system for a Volvo bus. It discusses the components of an automotive air conditioning system including a compressor, condenser, expansion device, evaporator, and accumulator. The document then provides details of the experimental setup used, which models these components in a closed refrigeration loop using R-134a refrigerant. Calculations are shown to determine the cooling load required for the bus based on parameters like occupancy, glass area, temperature conditions, and heat sources. The cooling load is calculated to be 1.13 kW, requiring a system with 8.25 tons of cooling capacity.
A digital footprint can impact job applications, school admission, and people's perceptions by leaving an online record of what users post, share, join, and upload. The document advises being careful about personal information disclosed on websites, forms, and photos due to the permanence of content online. It defines a digital footprint as all internet data related to an individual, including their online activities and digital content posted by others about them.
Structural Monitoring of Buildings Using Wireless Sensor Networks (IJMER)
1) The document discusses using wireless sensor networks and smart sensors to monitor buildings and structures for risks from natural hazards like earthquakes as well as man-made risks like fires.
2) It specifically examines using the Berkeley Mote and MICAz OEM Edition Mote wireless sensor platforms to monitor factors like structural performance, damage, gas leaks, and fires in buildings.
3) The MICAz Mote is highlighted as a potential wireless sensor that could be deployed to monitor building performance and check for risks, though more research is needed on effective communication modes for a full wireless sensor network for building risk monitoring.
Aiden grew a plant from seeds over several weeks, carefully taking care of and observing the sprouts as they emerged from the soil and continued growing. The sprouts received water and sunlight as they developed photosynthesis, and after six weeks the plant was successfully transferred to a new pot, concluding Aiden's project of growing a plant from seeds.
User Navigation Pattern Prediction from Web Log Data: A Survey (IJMER)
Advance Clustering Technique Based on Markov Chain for Predicting Next User M... (idescitation)
According to surveys, India is one of the leading countries in the world for technical and management education, and the number of students is increasing at a growth rate of about 45% per annum. Advances in technology have a marked effect on the education system and help upgrade higher education; some universities and colleges already use these technologies, and the web log is one of them. The main aim of this paper is to cluster web logs in order to predict the next user movement and analyse user behavior. The paper centres on a web log clustering technique based on Markov chain results and presents an approach to clustering web site users and predicting their behavior on the next visit. Methodology: to generate effective results, web usage data from approximately 14 engineering colleges is used, and an advanced clustering approach is presented after optimising other clustering approaches. Results: user behavior is predicted with the help of the advanced clustering approach based on FPCM and k-means, and the proposed algorithm is used to mine and predict users' preferred paths. Existing approaches to predicting user behavior are not sufficient because of their sensitivity to noise; with the help of the ACM, noise is reduced and a more accurate prediction of user behavior is obtained. Implementation: the algorithm was implemented in MATLAB, DTRG and Java. The experimental results show that the method is very effective in predicting user behavior and validate its effectiveness in comparison with previous studies.
The World Wide Web is a huge repository of information, and the volume of information increases tremendously every day. The number of users is also increasing day by day, and a lot of research has taken place to reduce users' browsing time. Web usage mining is a type of web mining in which mining techniques are applied to log data to extract the behaviour of users. Clustering plays an important role in a broad range of applications like Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the grouping of similar instances or objects; the key factor for clustering is some sort of measure that can determine whether two objects are similar or dissimilar. In this paper, a novel clustering method to partition user sessions into accurate clusters is discussed. The accuracy and various performance measures of the proposed algorithm show that the proposed method is a better method for web log mining.
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...ijcsa
High school students must be observed for their slow or quick learning abilities in order to provide them with the best education practices, and such analysis can be performed on student performance data. The high school student data was obtained from schools in various regions of Punjab, a pivotal state of India. The complete student data, along with a selective set of almost 1300 students obtained from one school in these regions, has undergone testing using the model proposed in this paper. The proposed model is based upon the naïve Bayes classification model, using multi-factor features obtained from the input dataset. The subjects have been divided into two primary groups: difficult and normal. The classification algorithm has been applied individually to the data grouped in the various subject groups. Both of the early stage classification events produced almost similar results, whereas the results obtained from the classification events over the averaging factors and the floating factors told a different story than the early stage classification. The results show that deep analysis of the data reveals in-depth facts from the input data, and the proposed model can be considered an effective classification model when evaluated from the results described in the earlier sections.
An effective search on web log from most popular downloaded contentijdpsjournal
A Web page recommender system effectively predicts the best related web page to search. While searching for a word in a search engine, it may display unnecessary links and unrelated data to the user; to avoid this problem, the conceptual prediction model combines both web usage and domain knowledge. The proposed conceptual prediction model automatically generates a semantic network of the semantic Web usage knowledge, which is the integration of domain knowledge and web usage information. Web usage mining aims to discover interesting and frequent user access patterns from web browsing data. The discovered knowledge can then be used for many practical web applications such as web recommendations, adaptive web sites, and personalized web search and surfing.
IRJET- Enhancing Prediction of User Behavior on the Basic of Web LogsIRJET Journal
The document discusses predicting user behavior based on web logs. It proposes using several algorithms to analyze web log data, including Apriori, KNN, FP-Growth, and an Improved Parallel FP-Growth algorithm. The algorithms are applied to preprocessed web log data to identify frequent patterns and items that provide insights into user behavior. Experimental results show the Improved Parallel FP-Growth algorithm provides higher mining efficiency and can handle large, growing datasets.
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...ijbuiiir1
Web usage mining is a kind of web mining which provides knowledge about user navigation behavior and extracts interesting patterns from the web. It refers to the automatic discovery and analysis of patterns in clickstream and associated data collected as a consequence of user interactions with web resources on one or more web sites. It identifies the needs and interests of users, which is useful for upgrading web resources: web site developers can update their web sites according to this information. This paper discusses the different methodologies that have been used in previous research work for discovering user behavior and predicting future requests.
This document discusses methods for compressing mouse cursor activity data collected by websites for analytics purposes. It evaluates 10 compression algorithms, including 5 lossless and 5 lossy methods. The results show that different algorithms are suited to different goals, such as reducing bandwidth, improving client-side performance, or accurately replicating the original cursor data. Lossy algorithms like piecewise linear interpolation and distance-thresholding offered better performance and bandwidth reduction than lossless LZW compression. The study contributes to making mouse cursor tracking a practical technology by reducing the data size.
COST-SENSITIVE TOPICAL DATA ACQUISITION FROM THE WEBIJDKP
The cost of acquiring training data instances for induction of data mining models is one of the main concerns in real-world problems. The web is a comprehensive source for many types of data which can be used for data mining tasks. But the distributed and dynamic nature of web dictates the use of solutions which can handle these characteristics. In this paper, we introduce an automatic method for topical data acquisition from the web. We propose a new type of topical crawlers that use a hybrid link context extraction method for topical crawling to acquire on-topic web pages with minimum bandwidth usage and with the lowest cost. The new link context extraction method which is called Block Text Window (BTW), combines a text window method with a block-based method and overcomes challenges of each of these methods using the advantages of the other one. Experimental results show the predominance of BTW in comparison with state of the art automatic topical web data acquisition methods based on standard metrics.
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...James Heller
This document summarizes an extensive literature review on web usage mining. It begins by defining web mining and its three types: web content mining, web structure mining, and web usage mining. It then describes the main stages of web usage mining in detail: pre-processing, storage models, pattern discovery, optimization techniques, and pattern analysis. Finally, it discusses the characteristics of web server log data, including that it is unstructured, heterogeneous, distributed, contains different data types and dynamic content, and is voluminous, non-scalable, and incremental in nature.
A detail survey of page re ranking various web features and techniquesijctet
This document discusses techniques for page re-ranking on websites based on user behavior analysis. It describes how web usage mining involves analyzing web server logs to extract patterns in user behavior. Common techniques discussed for page re-ranking include Markov models, data mining approaches like clustering and association rule mining, and analyzing linked web page structures. The goal is to better understand user interests and predict future page access to improve information retrieval and optimize website design.
Research on classification algorithms and its impact on web miningIAEME Publication
This document summarizes research on web mining classification algorithms. It discusses how web mining can be classified into three types: web structure mining, web content mining, and web usage mining. The paper focuses on web usage mining, which involves extracting patterns from log data to analyze user behavioral patterns. Various classification algorithms are explored, including clustering, classification, and associative rule mining. The document also reviews related work applying these algorithms to web usage mining and discusses the different phases involved in web usage classification, including user identification.
The majority of computer and mobile phone enthusiasts use the web for searching. Web search engines are used for this searching; the results that the search engines return are provided to them by a software module known as the Web crawler. The size of the web is increasing round-the-clock, and the principal problem is searching this huge database for specific information. Deciding whether a web page is relevant to a search topic is a dilemma. This paper proposes a crawler called the "PDD crawler", which follows both a link-based and a content-based approach. This crawler follows a completely new crawling strategy to compute the relevance of a page: it analyses the content of the page based on the information contained in various tags within the HTML source code and then computes the total weight of the page. The page with the highest weight thus has the maximum content and the highest relevance.
Data mining refers to analyzing data from different perspectives to extract useful information. There are various types of data mining including business, scientific, and internet data mining. Web mining is a main application of data mining that involves the automated discovery of useful information from web documents and services. It has three domains: web content mining which extracts patterns from online information, web structure mining which describes how content is organized, and web usage mining which analyzes web access logs to understand user behavior. Common web mining techniques include clustering, classification, association rules, path analysis, and sequential patterns. Web mining tools like Mozenda can routinely extract, store, and publish web data to be used in various applications.
Certain Issues in Web Page Prediction, Classification and Clustering in Data ...IJAEMSJORNAL
Nowadays, data mining, of which web mining is a part, plays a vital role in various applications such as search engines, health care (extracting individual patient details from huge databases and analyzing diseases based on basic criteria), education systems (comparing performance levels across systems), social networking, e-commerce and knowledge management, all of which extract information based on the user query. The issues are the time taken to mine the target content or webpage from the search engines, space complexity, and predicting the frequent webpage for the next user based on users' behaviour.
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logsijsrd.com
With the exponential growth of the World Wide Web, there is so much information that it becomes hard to find data according to one's needs. Web usage mining is a part of web mining which deals with the automatic discovery of user navigation patterns from web logs. This paper presents an overview of web mining and derives navigation patterns from classification and clustering algorithms for web usage mining. Web usage mining contains three important tasks, namely data preprocessing, pattern discovery and pattern analysis based on the discovered patterns. The paper also contains a comparative study of web mining techniques.
Enactment of Firefly Algorithm and Fuzzy C-Means Clustering For Consumer Requ...IRJET Journal
The document proposes a novel methodology for predicting consumer demand and future requests on web pages using a hybrid approach. It first classifies consumers as potential or non-potential using a firefly-based neural network with Levenberg-Marquardt algorithm. Potential consumer data is then clustered using an improved fuzzy C-means clustering algorithm. Finally, upcoming consumer demand is predicted by analyzing patterns and recommending web pages with higher weights. The proposed approach is implemented in Java and CloudSim and aims to overcome limitations of existing recommendation systems by providing more accurate and efficient predictions in shorter time.
Design and Implementation of Carpool Data Acquisition Program Based on Web Cr...ijmech
Public transport now makes life more and more convenient, and the number of vehicles in large and medium-sized cities is growing rapidly. In order to take full advantage of social resources and protect the environment, regional end-to-end public transport services are established by analyzing online travel data. The use of computer programs for processing web pages is necessary for accessing a large amount of carpool data. In this paper, web crawlers are designed to capture travel data from several large service sites. In order to maximize the access to traffic data, a breadth-first algorithm is used, and the carpool data is saved in a structured form. Additionally, the paper provides a convenient method of data collection for the program.
User Navigation Pattern Prediction from Web Log Data: A Survey
International OPEN ACCESS Journal of Modern Engineering Research (IJMER)
ISSN: 2249–6645 | www.ijmer.com | Vol. 4 | Iss. 10 | Oct. 2014
Vaibhav S. Dhande (1), Dinesh D. Puri (2)
(1), (2) Department of Computer Engineering, SSBT’s COET, Bambhori, Jalgaon (M.S.), India
Abstract: This paper proposes a survey of Web page prediction techniques. Prefetching of Web pages has been widely used to reduce the access latency problem of Web users. However, if prefetching is not accurate and prefetched web pages are not visited by the users, the limited bandwidth of the network and the services of the server will not be used efficiently, and users may still face the problem of access delay. Therefore, an effective prediction method during prefetching is critical. Markov models have been widely used to predict and analyze users' navigational behavior. All the activities of web users are saved in web log files. The stored user sessions are used to extract popular web navigation paths and predict the current user's next web page visit.

Keywords: Clustering, Markov Model, N-Grams, User Sessions, Web Usage Mining

I. INTRODUCTION

The World Wide Web (WWW) is a collection of data which can be accessed by a Web browser. The World Wide Web is just a subset of the Internet: the WWW is conceptual, whereas the Internet is the physical aspect, such as cables, routers and switches. The Internet is the actual network of networks where all the information resides. The Hypertext Transfer Protocol (HTTP) and the File Transfer Protocol (FTP) are the methods used to transfer Web pages over the network. Hypertext is text which contains an address of another file or data.

Web mining research overlaps with many areas such as data mining, text mining, Web retrieval and information retrieval. The classification depends on aspects like the purpose and the data sources. Mining research concentrates on finding new information or knowledge in the data. On this basis, Web mining can be divided into three different categories: web usage mining, web content mining and web structure mining.

Web content mining [1] is the process of extracting useful information from Web documents. Content data is nothing but the collection of facts a Web page was designed to convey to the users. Web content mining is related to, but also different from, data mining and text mining. It is related to data mining because many data mining techniques can be applied in web content mining.

Web usage mining [1] is the application of data mining techniques to discover usage patterns from Web data in order to understand and better serve the needs of Web based applications. It consists of three phases: preprocessing, pattern discovery and pattern analysis. Web servers, client applications and proxies can easily capture data about Web usage.

Figure 1. Types of web mining

The goal of web structure mining [2] is to generate a structural summary about the web page and website. A hyperlink is a structural component that connects a web page to a different location. The first type of web structure mining is extracting patterns from hyperlinks in the web.
II. MOTIVATION
With the growing popularity of the World Wide Web, a large number of users all over the world access web sites. When a user accesses a website, large volumes of data, such as user addresses or requested URLs, are gathered automatically by Web servers and collected in the access log. This log is very important because users often repeatedly access the same type of web pages, and the record is maintained in log files. Such a series of accessed web pages can be considered a web access pattern, which is helpful for finding out the user behavior. Through this information about user behavior, we can make an accurate prediction of the user's next request, which can reduce the browsing time of a web page, thus saving the user's time and decreasing the server load. In recent years, a lot of research work has been done in the field of user navigation pattern prediction in web usage mining. The motivation of this study is to know what research has been done on user navigation pattern prediction from Web log data.
III. LITERATURE SURVEY
Recommendation systems are one of the early applications of Web prediction. Joachims et al. [3] proposed WebWatcher, which is a path-based recommendation model based on ANN and reinforcement learning. The system has the following properties: (a) WebWatcher provides several types of assistance, but most importantly it highlights the interesting hyperlinks as it accompanies the user; (b) it learns from experience; (c) WebWatcher runs as a centralized server so that it can assist any Web user running any type of Web browser as well as combine training data from thousands of different users.

Su et al. [4] have proposed the N-gram prediction model and applied the all-N-gram prediction model, in which several N-grams are built and used in prediction. Basically, an N-gram is a sequence of N web pages visited by a user. Their work aims to show that using simple n-gram models for n greater than two results in a significant gain in prediction accuracy while maintaining reasonable applicability. They proposed a path-based model for web page prediction. Their path-based model is built on a web-server log file L. They consider L to be preprocessed into a collection of user sessions, in such a way that each session is indexed by a unique user id and starting time. Each session is a sequence of requests, where each request corresponds to a visit to a web page (a URL). The log L then consists of a set of sessions. Their algorithm builds an n-gram prediction model based on occurrence frequency. Each sub-string of length n is an n-gram, and these sub-strings serve as the indices of a count table T. During its operation, the algorithm scans through all sub-strings exactly once, recording the occurrence frequencies of the next click immediately after the sub-string in all sessions. The most frequently occurring next request is used as the prediction for the sub-string.

Levene and Loizou [5] computed the information gain from the navigation trail to construct a Markov chain model for analysing user navigation patterns through the Web. Navigation through the web, colloquially known as "surfing", is one of the main activities of users during interaction with the web. When users follow a navigation trail they often tend to get disoriented in terms of the goals of their original query, and thus the discovery of typical user trails could be useful in providing navigation assistance. They give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. They present a novel method for online incremental computation of the entropy and a large deviation result regarding the length of a trail needed to realise the said entropy. They provide an error analysis of the estimation of the entropy in terms of the divergence between the empirical and actual probabilities. They also provide an extension of the technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.

M. Deshpande and G. Karypis [6] presented a class of Markov model-based prediction algorithms that are obtained by selectively eliminating a large fraction of the states of the All-Kth-Order Markov model. Their experiments on a variety of datasets have shown that the resulting Markov models have a very low state-space complexity and at the same time achieve substantially better accuracies than those obtained by the traditional algorithms.

M. Awad and L. Khan [7] have successfully combined several effective prediction models along with domain knowledge exploitation to improve the prediction accuracy. However, the model endures expensive training and prediction overheads because of the large number of labels/classes involved in the WPP.
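The n-gram counting scheme described by Su et al. can be made concrete with a short sketch. The following Python snippet is illustrative only (the session data and page identifiers are invented, and it is not the authors' implementation): it records, for every sub-string of n visited pages, how often each next page follows it, and predicts the most frequent follower.

```python
from collections import defaultdict

def build_ngram_model(sessions, n):
    """Count, for every n-gram of visited pages, how often each next page follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for i in range(len(session) - n):
            ngram = tuple(session[i:i + n])
            next_page = session[i + n]
            counts[ngram][next_page] += 1
    return counts

def predict_next(counts, recent_pages):
    """Return the most frequently observed next page for the given n-gram, if any."""
    followers = counts.get(tuple(recent_pages))
    if not followers:
        return None
    return max(followers, key=followers.get)

# Illustrative sessions (page identifiers are made up for the example).
sessions = [["A", "B", "C", "D"], ["A", "B", "C", "E"], ["B", "C", "D", "A"]]
model = build_ngram_model(sessions, n=2)
print(predict_next(model, ["B", "C"]))  # -> "D", the most frequent next click after B, C
```

In an all-N-gram variant, several such tables (for n = 1, 2, 3, ...) would be built and consulted together, but the counting step stays the same.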
M. T. Hassan, K. N. Junejo, and A. Karim [8] presented Bayesian models for learning and predicting key Web navigation patterns. Instead of modeling the general problem of Web navigation, they focus on key navigation patterns that have practical value. Furthermore, instead of developing complex models, they present intuitive probabilistic models for learning and prediction. The patterns that they consider are: short and long visit sessions, page categories visited in the first N positions, rank of page categories in the first
N positions, and the range of page views per page category. They learn and predict these patterns under four settings corresponding to what is known about the visit sessions (user ID and/or timestamp).

F. Khalil, J. Li and H. Wang [9] improved Web page access prediction accuracy by integrating all three prediction models: clustering, Markov models, and association rules, according to certain constraints. Their model, IMAC, integrates the three models using a lower order Markov model. Clustering is used to group homogeneous user sessions, low order Markov models are built on the clustered sessions, and association rules are used when the Markov models cannot make clear predictions. The integrated model has been demonstrated to be more accurate than all three models implemented individually, as well as other integrated models. The integrated model has less state space complexity and is more accurate than a higher order Markov model.

Bhawna Nigam and Dr. Suresh Jain [10] proposed three different prefetching and caching schemes, i.e. prefetching only, prefetching with caching, and prefetching from caching. A Dynamic Nested Markov model is used for predicting the next accessed web page. The experimental results show that the prefetching with caching scheme gives good results. By applying these schemes, users' web access latency can be minimized and quality of service can be provided to the web user.

Mamoun A. Awad and Issa Khalil [11] analysed and studied the Markov model and the all-Kth Markov model in Web prediction. They proposed a new modified Markov model to alleviate the issue of scalability in the number of paths. They used standard benchmark data sets to analyse, compare, and demonstrate the effectiveness of their techniques using variations of Markov models and association rule mining. Their experiments show the effectiveness of the modified Markov model in reducing the number of paths without compromising accuracy. Additionally, the results support their conclusion that accuracy improves with higher orders of the all-Kth model.

Poornalatha G and Prakash S Raghavendra [12] presented a paper to solve the problem of predicting the next page to be accessed by the user, based on mining the web server logs that maintain the information of users who accessed the web site. The predicted next page may be prefetched by the browser, which in turn reduces the latency for the user. Thus, analysing the user's past behavior to predict the future web pages to be navigated by the user is of great importance.

Li Yue et al. [13] propose a DOM-based Block Text Identification method to detect navigation pages. The method extracts block-text segments from a web page; if the number of segments is too small or too large, then that web page is classified as a navigation page. This method is based on the observation that a common content page contains main content, and this main content is not divided into a lot of blocks. So if a web page contains no block text or contains too many block texts, it is more likely a navigation page than a content page.

A. Anitha [14] proposed to integrate Markov model based sequential pattern mining and clustering. With the help of the proposed approach, prediction accuracy increases by approximately 12% compared to the traditional Markov model. The main advantage of the proposed hierarchical clustering approach is that every object must be a candidate of only one cluster.
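Most of the approaches surveyed here build on low-order Markov models estimated from page-to-page transition frequencies. As a rough, hedged sketch (not any particular author's implementation; the page names are made up), a first-order version can be expressed as follows.

```python
from collections import defaultdict

class FirstOrderMarkov:
    """Next-page prediction from first-order transition frequencies in web sessions."""

    def __init__(self):
        self.transitions = defaultdict(lambda: defaultdict(int))

    def fit(self, sessions):
        # Count how often each page is followed by each other page.
        for session in sessions:
            for current_page, next_page in zip(session, session[1:]):
                self.transitions[current_page][next_page] += 1
        return self

    def predict(self, current_page):
        # Return the next page with the highest estimated transition probability.
        followers = self.transitions.get(current_page)
        if not followers:
            return None
        total = sum(followers.values())
        best = max(followers, key=followers.get)
        return best, followers[best] / total

# Illustrative usage with made-up page identifiers.
model = FirstOrderMarkov().fit([["home", "products", "cart"],
                                ["home", "products", "contact"]])
print(model.predict("products"))  # e.g. ("cart", 0.5)
```

An all-Kth-order model would keep one such table per history length K and fall back from longer to shorter histories; the selective pruning of Deshpande and Karypis then removes rarely used or unreliable states from those tables.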
The traditional Markov models have a serious limitation: low order Markov models have good coverage but lack accuracy because of their short history, while high order Markov models provide good prediction accuracy but suffer from high state space complexity because they use long browsing histories. For that reason, the proposed approach combines the advantages of both, and sequential pattern mining is used to improve the accuracy of the prediction process. We have summarized various methods of web usage mining in the next section.

Mehrdad Jalali, Norwati Mustapha, Md. Nasir Sulaiman and Ali Mamat [15] advanced their previous work and renamed their architecture WebPUM. In WebPUM they proposed a novel formula for assigning weights to the edges of an undirected graph in order to classify the current user activity. They used the Longest Common Subsequence algorithm to predict the user's near-future movements and conducted two main experiments: the first for navigation pattern mining, and the second for prediction of the user's next request. They found that the quality of clustering for user navigation patterns and the quality of recommendation improved for both the CTI and MSNBC datasets.

V. Sujatha and Punithavalli [16] proposed the Prediction of User navigation patterns using Clustering and Classification (PUCC) from web log data. In the first stage, PUCC focuses on separating the potential users in the web log data; in the second stage, a clustering process is used to group the potential users with similar interests; and in the third stage, the results of classification and clustering are used to predict the users' future requests. Figure (2) shows the PUCC model. The first stage is the cleaning stage, in which the unwanted log entries were removed. In the second stage, the cookies were identified and removed. The result was then used to identify potential users. From the potential users, a graph-partitioning clustering algorithm was used to discover the navigation patterns. An LCS classification algorithm was then used to predict future requests.
Figure 2. PUCC Model
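Both WebPUM [15] and PUCC [16] rely on Longest Common Subsequence matching to relate an ongoing session to mined navigation patterns. The sketch below is only an illustration with assumed page names, not the authors' implementation; it computes the standard dynamic-programming LCS length and uses it to score candidate patterns.

```python
def lcs_length(a, b):
    """Length of the Longest Common Subsequence of two page sequences."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def most_similar_pattern(current_session, patterns):
    """Pick the mined navigation pattern closest to the ongoing session."""
    return max(patterns, key=lambda p: lcs_length(current_session, p))

# Hypothetical mined navigation patterns and an ongoing user session.
patterns = [["home", "products", "cart", "checkout"],
            ["home", "blog", "contact"]]
current = ["home", "products"]
print(most_similar_pattern(current, patterns))
# -> ['home', 'products', 'cart', 'checkout']
```

The pages that follow the matched portion of the winning pattern would then be recommended or prefetched for the current user.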
Trilok Nath Pandey, Ranjita Kumari Dash, Alaka Nanda Tripathy and Barnali Sahu [17] proposed the IMC (Integrating Markov Model with Clustering) approach for user navigation pattern prediction. In this paper the authors improve Markov model accuracy by grouping web sessions into clusters. The web pages in the user sessions are first allocated to categories according to web services that are functionally meaningful. The k-means clustering algorithm is then applied using the most appropriate number of clusters and distance measure, and Markov model techniques are applied to each cluster as well as to the whole data set. The advantages of this approach are that it improves the accuracy of the lower order Markov model and reduces the state space complexity of the higher order Markov model.
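As a rough illustration of the IMC idea rather than the authors' code, the following sketch groups hypothetical session vectors with k-means; a per-cluster Markov model, estimated as in the earlier first-order sketch, would then serve predictions for users assigned to that cluster. The category encoding and the data are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical session vectors: each row counts visits per functional page
# category (columns assumed to be [home, product, cart, support]).
session_vectors = np.array([
    [1, 3, 1, 0],
    [1, 2, 1, 0],
    [1, 0, 0, 4],
    [2, 0, 0, 3],
])
sessions = [["home", "product", "product", "product", "cart"],
            ["home", "product", "product", "cart"],
            ["home", "support", "support", "support", "support"],
            ["home", "home", "support", "support", "support"]]

# Group sessions with k-means; a first-order Markov model (as sketched
# earlier) would then be estimated separately for each cluster.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(session_vectors)
clusters = {}
for label, session in zip(labels, sessions):
    clusters.setdefault(int(label), []).append(session)
print(clusters)
```

Restricting each Markov model to a homogeneous cluster keeps its state space small while still capturing the behaviour of that group of users.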
Mathias Gery and Hatem Haddad [18] distinguished three web mining approaches that exploit web logs: Association Rules (AR), Frequent Sequences (FS) and Frequent Generalized Sequences (FGS). Algorithms for the three approaches were developed and experiments were carried out with real web log data. Association Rules: in data mining, association rule learning is a popular and well researched method for discovering interesting relations between variables in large databases; it analyses and presents strong rules discovered in the database using different measures of interestingness. In [18], the problem of finding web pages visited together is treated as the problem of finding associations among item sets in transaction databases: once transactions have been identified, each of them represents a basket and each resource an item. Frequent Sequences: this technique attempts to discover time-ordered sequences of URLs that have been followed by past users. Frequent Generalized Sequences (FGS): a generalized sequence is a sequence allowing wildcards, in order to reflect the users' navigation in a flexible way; the authors used the algorithm proposed by Gaul to extract frequent generalized subsequences. They performed experiments on three collections of web log data sets: one from a small web site, another from a large web site and a third from an intranet web site. Evaluating the three web mining approaches on these three types of real web log data, they found that Frequent Sequences (FS) gives better accuracy than AR and FGS.
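To make the Frequent Sequences approach concrete, the sketch below counts contiguous page sequences and keeps those that meet a minimum support; it is a simplified illustration under invented sessions, not the algorithm evaluated in [18].

```python
from collections import Counter

def frequent_sequences(sessions, length, min_support):
    """Count contiguous URL sequences of a given length and keep frequent ones."""
    counts = Counter()
    for session in sessions:
        for i in range(len(session) - length + 1):
            counts[tuple(session[i:i + length])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_support}

# Hypothetical sessions extracted from a web log.
sessions = [["home", "products", "cart"],
            ["home", "products", "reviews"],
            ["home", "products", "cart"]]
print(frequent_sequences(sessions, length=2, min_support=2))
# -> {('home', 'products'): 3, ('products', 'cart'): 2}
```

A generalized-sequence variant would additionally allow wildcard positions inside the counted sequences, at the cost of a larger candidate space.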
Yi-Hung Wu and Arbee L. P. Chen [19] modelled user behaviour as sequences of consecutive web page accesses derived from the access log of a proxy server. The frequent sequences are discovered and organized as an index, and based on this index they propose a scheme for predicting user requests and a proxy-based framework for prefetching web pages. They performed experiments on real data, and the results show that their approach makes predictions with a high degree of accuracy and little overhead: the best hit ratio of the prediction reaches 75.69%, while the longest time to make a prediction is only 1.9 ms. The disadvantage of this approach is that the average service rate is very low. Another problem is the setting of the three thresholds used in the mining stage. These thresholds have a great impact on the construction of the pattern trees. The minimum support and minimum confidence are used to prune useless paths; obviously, some information may be lost if the pruning effects are overestimated. On the other hand, the
grouping confidence is only useful for web pages that are strongly related because of certain editorial techniques, such as embedded images and frames.
Figure 3. Flowchart of the prediction system using the proxy server log.
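The sensitivity to mining thresholds noted for [19] can be illustrated with a toy prefetching rule: raising the minimum support or minimum confidence prunes more candidate pages, which saves bandwidth but also lowers coverage. The counts and thresholds below are hypothetical, and the rule is far simpler than the cited index-based scheme.

```python
def prefetch_candidates(pair_counts, current_page, min_support, min_confidence):
    """Suggest pages to prefetch, pruning candidates by support and confidence."""
    outgoing = {pair: c for pair, c in pair_counts.items() if pair[0] == current_page}
    total = sum(outgoing.values())
    if total == 0:
        return []
    candidates = []
    for (_, next_page), count in outgoing.items():
        confidence = count / total
        if count >= min_support and confidence >= min_confidence:
            candidates.append((next_page, confidence))
    return sorted(candidates, key=lambda item: item[1], reverse=True)

# Hypothetical page-pair counts mined from a proxy log.
pair_counts = {("home", "products"): 30, ("home", "blog"): 3, ("products", "cart"): 20}
print(prefetch_candidates(pair_counts, "home", min_support=5, min_confidence=0.5))
# -> [('products', 0.909...)]; the low-support "blog" page is pruned
```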
Siriporn Chimphlee, Naomie Salim, Mohd Salihin Bin Ngadiman and Witcha Chimphlee [20] proposed a method for constructing first-order and second-order Markov models of web site access prediction based on past visitor behaviour and compared it with the association rules technique. In these approaches, the session identification technique collects the sequences of user requests, distinguishing requests for the same web page made in different browsing sessions. In the experiment, three algorithms are used: association rules, the first-order Markov model and the second-order Markov model. None of these algorithms is fully successful in correctly predicting the next request, but the first-order Markov model performs best because it can extract the sequence rules and choose the best rule for prediction, while the second-order model decreases the coverage. This is because these models do not look far enough into the past to correctly discriminate the different modes of the generative process.
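The coverage loss of the second-order model reported in [20] is commonly mitigated by falling back to a lower order model whenever the longer history has not been observed, in the spirit of the all-Kth Markov model mentioned earlier. The sketch below uses hypothetical sessions and is not the cited experiment.

```python
from collections import defaultdict

def build_kth_order_counts(sessions, k):
    """Count transitions from the last k visited pages to the next page."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for i in range(len(session) - k):
            counts[tuple(session[i:i + k])][session[i + k]] += 1
    return counts

def predict_with_fallback(history, second_order, first_order):
    """Use the second-order model when its state exists, else fall back."""
    if len(history) >= 2:
        nexts = second_order.get(tuple(history[-2:]))
        if nexts:
            return max(nexts, key=nexts.get)
    nexts = first_order.get(tuple(history[-1:]))
    return max(nexts, key=nexts.get) if nexts else None

# Hypothetical sessions from a web log.
sessions = [["home", "products", "cart"],
            ["home", "products", "reviews"],
            ["about", "products", "cart"]]
second_order = build_kth_order_counts(sessions, 2)
first_order = build_kth_order_counts(sessions, 1)
print(predict_with_fallback(["about", "products"], second_order, first_order))  # -> "cart"
```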
IV. CONCLUSION
The conclusion drawn from the literature survey is that a wide range of research has been carried out on user navigation pattern prediction. In the existing research, various pattern discovery techniques, such as graph partitioning techniques for clustering and LCS and Naive Bayesian techniques for classification, are used for user navigation pattern prediction, and many types of prediction models have been developed.
The World Wide Web has necessitated the use of automated tools to locate desired information resources and to follow and assess usage patterns. Web page prefetching has been widely used to reduce the user access latency problem of the Internet; its success mainly relies on the accuracy of web page prediction. The Markov model is the most commonly used prediction model because of its high accuracy, although low order Markov models offer lower accuracy and higher coverage.
The higher order models have a number of limitations associated with them: i) higher state complexity, ii) reduced coverage, and iii) sometimes even worse prediction accuracy. Clustering is one of the best solutions for addressing the poor prediction accuracy of the Markov model; it is a powerful method for arranging users' sessions into clusters according to their similarity. We have discussed some of the techniques to overcome the issues of web page prediction. As the web continues to expand, the volume of web usage data in web databases will keep growing, and the above findings can serve as a good guide for effective web page prediction. In this paper, we have presented a comprehensive survey of up-to-date research on web page prediction, along with a brief introduction to web mining, clustering and web page prediction. However, research on web page prediction is still at its beginning and a much deeper understanding needs to be gained.