ICGCET 2019 | 5th International Conference on Green Computing and Engineering Technologies, 7-9 September 2019, Morocco.
The conference aims to promote the work of researchers, scientists, engineers, and students from across the world on advancements in electronic and computer systems.
CONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVAL (ijcsa)
Search engines today retrieve thousands of web pages for a single query, most of which are irrelevant, so listing results according to user needs is a real necessity. The challenge lies in ordering the retrieved pages and presenting them to users in line with their interests. Search engines therefore apply page ranking algorithms that estimate the importance of each web page over the web and re-rank results by relevance to the user's query. The proposed work surveys web page ranking methods and recently developed improvements to them, and further proposes a new content-based page ranking technique for implementation. The technique determines how important a particular web page is by evaluating the data users have clicked on as well as the content available on those pages. The results demonstrate the effectiveness and efficiency of the proposed ranking technique.
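The abstract above builds on page rank algorithms that estimate a page's importance over the web. As background, a minimal sketch of the classic PageRank power iteration, the baseline that content- and click-based variants modify, might look as follows; the adjacency-list graph, damping factor, and convergence settings are illustrative choices, not values from the paper.

```python
def pagerank(links, damping=0.85, tol=1e-8, max_iter=100):
    """Classic PageRank power iteration.

    links[i] is the list of page indices that page i links to.
    Returns a list of scores summing to 1.
    """
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Every page gets the uniform "teleport" share first.
        new_rank = [(1 - damping) / n] * n
        for i, outlinks in enumerate(links):
            if outlinks:
                # Spread page i's rank evenly over its outlinks.
                share = damping * rank[i] / len(outlinks)
                for j in outlinks:
                    new_rank[j] += share
            else:
                # Dangling page with no outlinks: spread rank uniformly.
                for j in range(n):
                    new_rank[j] += damping * rank[i] / n
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank

# A 3-page cycle (0 -> 1 -> 2 -> 0) yields a uniform ranking.
ranks = pagerank([[1], [2], [0]])
```

A click- or content-aware variant of the kind the abstract describes could, for example, replace the uniform teleport term with a distribution weighted by observed click counts or content similarity.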
Web Page Recommendation Using Web Mining (IJERA Editor)
Content of many kinds is generated on the World Wide Web in huge volumes, so web recommendation has become an important part of delivering relevant results to users. Recommendations of many kinds reach users every day, including images, video, audio, query suggestions, and web pages. This paper aims to provide a framework for web page recommendation: 1) it first describes the basics of web mining and its types; 2) it details each web mining technique; and 3) it proposes an architecture for personalized web page recommendation.
International Journal of Engineering Research and Applications (IJERA) is an open-access, online, peer-reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nanotechnology & Science, Power Electronics, Electronics & Communication Engineering, Computational Mathematics, Image Processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design, etc.
In today's world of information technology, everyone tends to do business electronically. With so much business now happening on the World Wide Web (WWW), it is very important for website owners to provide a platform that attracts more customers to their site, and presenting information well is the way to bring in more customers or users. The customer is the end user, whose access to the information ultimately yields credit to the website owner. In this paper we define web mining and present a method to use it to understand user and website behaviour, which in turn improves the website's information to attract more users. The paper also presents an overview of research on pattern extraction and web content mining, and on how web mining can act as a catalyst for e-business.
Web mining is the application of data mining techniques to discover patterns from the World Wide Web. As the name suggests, it is information gathered by mining the web.
An Effective Approach for Document Crawling With Usage Pattern and Image Base... (Editor IJCATR)
As the Web grows day by day, with new pages uploaded every second, it has become difficult for users to find relevant and necessary information with traditional retrieval approaches. Because the amount of information on the World Wide Web has increased so much, information retrieval tools such as search engines have become a necessity for finding desired information on the Internet or the Web. Even with the crawling, indexing, and page ranking techniques that underlying search engines already apply before results are generated, the result sets returned lack accuracy, efficiency, and precision; they often fail to satisfy the user's request and cause frustration on the user's side. Large numbers of irrelevant links and pages, unwanted information, topic drift, and load on servers are further issues that must be caught and rectified on the way to an efficient, smart search engine. The main objective of this paper is to propose an improvement to the existing crawling methodology that attempts to reduce server load by using computational software processes known as "Migrating Agents" to download only pages relevant to a particular topic. Each downloaded page is then assigned a unique positive number, i.e. a rank, computed from synonyms and other related words, user preferences captured in domain profiles and the user's fields of interest, and past evidence of a page's relevance, namely the average amount of time users have spent on it. A solution is also given for image-based web crawling that combines digital image processing techniques with crawling.
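The ranking step described above combines term evidence (synonyms and related words) with behavioural evidence (average time spent on a page). A toy sketch of such a combined score might look like this; the weights, the dwell-time cap, and the linear mixing are illustrative assumptions, not the paper's actual formula.

```python
def score_page(page_terms, query_synonyms, avg_dwell_seconds,
               term_weight=0.7, dwell_weight=0.3, max_dwell=300.0):
    """Toy relevance score mixing synonym overlap with past dwell time.

    page_terms      : words extracted from the page
    query_synonyms  : the query term plus its synonyms/related words
    avg_dwell_seconds : average time users spent on the page
    All weights and the dwell cap are illustrative, not from the paper.
    """
    overlap = len(set(page_terms) & set(query_synonyms))
    term_score = overlap / max(len(query_synonyms), 1)
    # Cap and normalize dwell time so one sticky page cannot dominate.
    dwell_score = min(avg_dwell_seconds, max_dwell) / max_dwell
    return term_weight * term_score + dwell_weight * dwell_score

score = score_page(["web", "mining", "crawl"], ["web", "crawler"], 150.0)
```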
Enhance Crawler For Efficiently Harvesting Deep Web Interfaces (rahulmonikasharma)
The web is changing quickly and the volume of web resources is rising, so efficiency has become a challenging problem when crawling such data. Hidden web content is data that search engines cannot index because it stays behind searchable web interfaces. The proposed system aims to develop a focused-crawler framework for efficiently gathering hidden web interfaces. First, the crawler performs site-based searching for center pages with the help of web search tools, to avoid visiting an excessive number of pages. To make the focused crawler's results more specific, the proposed crawler ranks websites, giving higher priority to those more relevant to a given search. The crawler achieves fast in-site searching by watching for the most relevant links with adaptive link ranking. A spell checker is incorporated to correct the input, and reverse searching with incremental site prioritizing is applied for wide-ranging coverage of hidden web sites.
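The "fast in-site searching via adaptive link ranking" in the abstract amounts to crawling the most promising links first. One common way to realise that is a priority-queue frontier, sketched below; the anchor-text-overlap scoring function is a stand-in assumption for whatever adaptive score the paper actually learns.

```python
import heapq

class RankedFrontier:
    """Priority-queue crawl frontier: most relevant links are popped first.

    relevance() here is a simple stand-in (topic-keyword overlap in the
    anchor text) for the adaptive link-ranking score the paper describes.
    """

    def __init__(self, topic_keywords):
        self.topic = set(topic_keywords)
        self.heap = []
        self.seen = set()

    def relevance(self, anchor_text):
        words = set(anchor_text.lower().split())
        return len(words & self.topic) / max(len(self.topic), 1)

    def push(self, url, anchor_text):
        if url not in self.seen:
            self.seen.add(url)
            # heapq is a min-heap, so negate the score for max-first order.
            heapq.heappush(self.heap, (-self.relevance(anchor_text), url))

    def pop(self):
        return heapq.heappop(self.heap)[1] if self.heap else None

frontier = RankedFrontier(["deep", "web", "forms"])
frontier.push("http://a", "cat pictures")
frontier.push("http://b", "deep web search forms")
frontier.push("http://c", "web directory")
```

An adaptive variant would periodically re-estimate the scoring function from the relevance of pages already fetched, rather than keeping a fixed keyword set.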
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ... (inventionjournals)
The Internet is overloaded with information due to its unstable growth, which makes information search a complicated process. Recommendation systems (RS) are now widely used in many areas to surface items of interest to users. With the development of e-commerce and information access, recommender systems have become a popular way to prune large information spaces and direct users toward the items that best meet their needs and preferences; with the exponential explosion of content generated on the Web, recommendation techniques have become indispensable. Web recommendation systems help users reach exactly the information they want and make search easier. Web recommendation is a web personalization technique that suggests web pages or items to a user based on previous browsing history. However, the tremendous growth in available information and in site visitors in recent years poses key challenges for recommender systems: producing high-quality recommendations from large volumes of data without returning unwanted items in place of the targeted item or product, and performing many recommendations per second over millions of users and items. Meeting these challenges requires new recommender technologies that can quickly produce high-quality recommendations even for very large-scale problems. To address these issues we combine two recommender processes, fuzzy clustering and collaborative filtering. Fuzzy clustering predicts the items or products that will be accessed in the future from the user's previous browsing behaviour, and collaborative filtering then produces the results the user expects from the fuzzy clustering output and the collection of items in the web database.
Using this new recommendation system, the expected product or item is returned in minimal time; the system reduces unrelated and unwanted items and provides results within the user's domain of interest.
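The collaborative filtering stage described above can be illustrated with a minimal user-based sketch: score each item a target user has not seen by similarity-weighted votes from other users. The cosine similarity, the rating matrix, and the user names are all illustrative; the paper's actual pipeline also feeds in fuzzy-cluster predictions, which this sketch omits.

```python
import math

def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(target, others, ratings, top_n=2):
    """User-based collaborative filtering over a rating matrix.

    ratings[user][item] is a visit/interest score; 0 means unseen.
    Unseen items are ranked by similarity-weighted votes from others.
    """
    sims = {u: cosine(ratings[target], ratings[u]) for u in others}
    norm = sum(abs(s) for s in sims.values()) or 1.0
    scores = {}
    for item in range(len(ratings[target])):
        if ratings[target][item] == 0:  # only recommend unseen items
            weighted = sum(sims[u] * ratings[u][item] for u in others)
            scores[item] = weighted / norm
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

ratings = {
    "alice": [5, 3, 0, 0],   # alice has not seen items 2 and 3
    "bob":   [5, 3, 4, 0],
    "carol": [1, 0, 0, 5],
}
picks = recommend("alice", ["bob", "carol"], ratings, top_n=1)
```

Because alice's ratings closely match bob's, the similarity-weighted vote favours item 2 (which bob rated highly) over item 3.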
A Survey on Approaches of Web Mining in Varied Areas (inventionjournals)
There has been a lot of research in recent years on efficient web searching. Several papers have proposed algorithms over user feedback sessions to evaluate how well user search goals can be inferred: when information is retrieved and the user clicks a particular URL, ranking is done automatically from the click rate by clustering the feedback sessions. Web search engines have made enormous contributions to the web and society, making information quick and easy to find, yet they are far from optimal. A major deficiency of generic search engines is that they follow a "one size fits all" model and do not adapt to individual users. This work describes a new system, User Profile Relevant Results (UProRevs), which filters the results returned by a search engine based on the user's profile. "UProRevs - User Profile Relevant Results" has been published by the IEEE Computer Society in the proceedings of the 10th International Conference on Information Technology.
The web is a collection of interrelated files on one or more web servers, while web mining means extracting valuable information from web databases. Web mining is a data mining domain in which data mining techniques are used to extract information from web servers; web data includes web pages, web links, objects on the web, and web logs. Web mining is used to understand customer behaviour and to evaluate a particular website from the information stored in its web log files, and it is carried out with data mining techniques, namely classification, clustering, and association rules. Its beneficial application areas include electronic commerce, e-learning, e-government, e-policies, e-democracy, electronic business, security, crime investigation, and digital libraries. Retrieving the required web page from the web efficiently and effectively is a challenging task, because the web is made of unstructured data that delivers large amounts of information and increases the complexity of handling information from different web service providers; relevant information thus becomes very hard to find, extract, filter, or evaluate for users. In this paper we study the basic concepts of web mining, its classification, processes, and issues, and we also analyze the research challenges of web mining.
The International Journal of Engineering & Science aims to provide a platform for researchers, engineers, scientists, and educators to publish their original research results, exchange new ideas, and disseminate information on innovative designs, engineering experiences, and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal are blind peer-reviewed; only original articles are published.
Papers for publication in The International Journal of Engineering & Science are selected through rigorous peer review to ensure originality, timeliness, relevance, and readability.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of Engineering and Technology.
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca... (IJSRD)
The development of the web over the past few years has created many challenges in this field. Recent work in the area searches data in a tree-based search pattern, and various sequential mining algorithms have been developed to date. Web usage mining operates on web server logs, which contain users' navigation history. The recommender system is explained in full, with the whole procedure laid out. Such search results lead to proper and efficient search, but the problems were the time consumed and the quality of the results generated. A new local search algorithm is therefore proposed for country-wise search, making searching more efficient on a local-results basis. This approach has led to an advancement in search-based methods and in the results generated.
IDENTIFYING IMPORTANT FEATURES OF USERS TO IMPROVE PAGE RANKING ALGORITHMS (Zac Darcy)
The web is a wide, varied, and dynamic environment in which different users publish their documents. Web mining is a data mining application in which web patterns are explored, and studies on it can be categorized into three classes: application mining, content mining, and structure mining. The Internet has become increasingly significant, and search engines are an important tool for responding to users' interactions. Among the algorithms used to find the pages users want is the PageRank algorithm, which ranks pages based on users' interests. As the algorithm most widely used by search engines, including Google, it has proved its worth against similar algorithms; but given the Internet's growth and increasing use, improving this algorithm's performance is one of the necessities of web mining. The current study builds on the Ant Colony algorithm and marks the most visited links with higher amounts of pheromone. Results indicate that the proposed method is more accurate than previous methods. The Ant Colony algorithm, a swarm intelligence algorithm inspired by the social behavior of ants, can effectively model the social behavior of web users; in addition, application mining and structure mining techniques can be used simultaneously to improve page ranking performance.
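The pheromone idea above can be sketched concretely: links on a user's browsing path receive pheromone, all links evaporate a little each round so stale pages fade, and pages are ranked by the pheromone on their incoming links. The deposit and evaporation constants below are illustrative assumptions, not the paper's tuned values.

```python
def update_pheromones(pheromone, visited_path, deposit=1.0, evaporation=0.1):
    """One ant-colony step over a user's browsing session.

    pheromone    : dict mapping (src, dst) link tuples to pheromone levels
    visited_path : ordered list of pages the user visited
    """
    # All links evaporate a little, so rarely-visited pages fade out.
    for link in pheromone:
        pheromone[link] *= (1 - evaporation)
    # Links actually traversed in this session get fresh pheromone.
    for link in zip(visited_path, visited_path[1:]):
        pheromone[link] = pheromone.get(link, 0.0) + deposit
    return pheromone

def rank_pages(pheromone):
    """Rank pages by total pheromone on their incoming links."""
    totals = {}
    for (_, dst), amount in pheromone.items():
        totals[dst] = totals.get(dst, 0.0) + amount
    return sorted(totals, key=totals.get, reverse=True)

ph = {}
ph = update_pheromones(ph, ["home", "a", "b"])
ph = update_pheromones(ph, ["home", "a", "c"])
order = rank_pages(ph)
```

After these two sessions the twice-traversed link into page "a" has the most pheromone, so "a" ranks first, illustrating how repeated user visits reinforce a page's score.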
Identifying Important Features of Users to Improve Page Ranking Algorithms (dannyijwest)
The increase in the number of ontologies on the Semantic Web, and the endorsement of OWL as its language of discourse, have led to a scenario where research in ontology engineering can make ontology development through reuse a viable option for ontology developers. The advantages are twofold: reusing existing ontological artefacts from the Semantic Web reduces semantic heterogeneity and helps interoperability, which is the essence of the Semantic Web; and from the development perspective, reuse cuts cost and development time, since ontology engineering requires expert domain skills and is a time-consuming process. We have devised a framework to address the challenges of reusing ontologies from the Semantic Web. In this paper we present methods for extracting and integrating concepts across multiple ontologies: extraction is based on features of OWL language constructs and context, and integration uses a relative semantic similarity measure we devised. We also present guidelines for evaluating the constructed ontology. The proposed methods have been applied to concepts from a food ontology, and evaluation on concepts from the domain of academics using the Golden Ontology Evaluation Method gave satisfactory outcomes.
Comparable Analysis of Web Mining Categories (theijes)
Web data mining is a current field of analysis that combines two research areas, data mining and the World Wide Web. Web data mining research draws on several disciplines, such as databases, artificial intelligence, and information retrieval. Its mining techniques are categorized into Web Content Mining, Web Structure Mining, and Web Usage Mining. In this work these mining techniques are analyzed, and the analysis concludes that Web Content Mining deals with an unstructured or semi-structured view of data, Web Structure Mining deals with link structure, and Web Usage Mining mainly concerns user interaction.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
In this world of information technology, everyone has the tendency to do business electronically. Today
lot of businesses are happening on World Wide Web (WWW), it is very important for the website owner to
provide a better platform to attract more customers for their site. Providing information in a better way is
the solution to bring more customers or users. Customer is the end-user, who accessing the information
in a way it yields some credit to the web site owners. In this paper we define web mining and present a
method to utilize web mining in a better way to know the users and website behaviour which in turn
enhance the web site information to attract more users. This paper also presents an overview of the
various researches done on pattern extraction, web content mining and how it can be taken as a catalyst
for E-business.
Web mining is the application of data mining techniques to discover patterns from the World Wide Web. As the name proposes, this is information gathered by mining the web
An Effective Approach for Document Crawling With Usage Pattern and Image Base...Editor IJCATR
As the Web continues to grow day by day each and every second a new page gets uploaded into the web; it has become
a difficult task for a user to search for the relevant and necessary information using traditional retrieval approaches. The amount of
information has increased in World Wide Web, it has become difficult to get access to desired information on Web; therefore it
has become a necessity to use Information retrieval tools like Search Engines to search for desired information on the Internet or
Web. Already Existing and used Crawling, Indexing and Page Ranking techniques that are used by the underlying Search Engines
before the result gets generated, the result sets that are returned by the engine lack in accuracy, efficiency and preciseness. The
return set of result does not really satisfy the request of the user and results in frustration on the user’s side. A Large number of
irrelevant links/pages get fetched, unwanted information, topic drift, and load on servers are some of the other issues that need to
be caught and rectified towards developing an efficient and a smart search engine. The main objective of this paper is to propose
or present a solution for the improvement of the existing crawling methodology that makes an attempt to reduce the amount of load
on server by taking advantage of computational software processes known as “Migrating Agents” for downloading the related
pages that are relevant to a particular topic only. The downloaded Pages are then provided a unique positive number i.e. called the
page has been ranked, taking into consideration the combinational words that are synonyms and other related words, user
preferences using domain profiles and the interested field of a particular user and past knowledge of relevance of a web page that
is average amount of time spent by users. A solution is also been given in context to Image based web Crawling associating the
Digital Image Processing technique with Crawling.
Enhance Crawler For Efficiently Harvesting Deep Web Interfacesrahulmonikasharma
Scenario in web is varying quickly and size of web resources is rising, efficiency has become a challenging problem for crawling such data. The hidden web content is the data that cannot be indexed by search engines as they always stay behind searchable web interfaces. The proposed system purposes to develop a framework for focused crawler for efficient gathering hidden web interfaces. Firstly Crawler performs site-based searching for getting center pages with the help of web search tools to avoid from visiting additional number of pages. To get more specific results for a focused crawler, projected crawler ranks websites by giving high priority to more related ones for a given search. Crawler accomplishes fast in-site searching via watching for more relevant links with an adaptive link ranking. Here we have incorporated spell checker for giving correct input and apply reverse searching with incremental site prioritizing for wide-ranging coverage of hidden web sites.
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...inventionjournals
Information is overloaded in the Internet due to the unstable growth of information and it makes information search as complicate process. Recommendation System (RS) is the tool and largely used nowadays in many areas to generate interest items to users. With the development of e-commerce and information access, recommender systems have become a popular technique to prune large information spaces so that users are directed toward those items that best meet their needs and preferences. As the exponential explosion of various contents generated on the Web, Recommendation techniques have become increasingly indispensable. Web recommendation systems assist the users to get the exact information and facilitate the information search easier. Web recommendation is one of the techniques of web personalization, which recommends web pages or items to the user based on the previous browsing history. But the tremendous growth in the amount of the available information and the number of visitors to web sites in recent years places some key challenges for recommender system. The recent recommender systems stuck with producing high quality recommendation with large information, resulting unwanted item instead of targeted item or product, and performing many recommendations per second for millions of user and items. To avoid these challenges a new recommender system technologies are needed that can quickly produce high quality recommendation, even for a very large scale problems. To address these issues we use two recommender system process using fuzzy clustering and collaborative filtering algorithms. Fuzzy clustering is used to predict the items or product that will be accessed in the future based on the previous action of user browsers behavior. Collaborative filtering recommendation process is used to produce the user expects result from the result of fuzzy clustering and collection of Web Database data items. 
Using this new recommendation system, it results the user expected product or item with minimum time. This system reduces the result of unrelated and unwanted item to user and provides the results with user interested domain.
`A Survey on approaches of Web Mining in Varied Areasinventionjournals
There has been lot of research in recent years for efficient web searching. Several papers have proposed algorithm for user feedback sessions, to evaluate the performance of inferring user search goals. When the information is retrieved, user clicks on a particular URL. Based on the click rate, ranking will be done automatically, clustering the feedback sessions. Web search engines have made enormous contributions to the web and society. They make finding information on the web quick and easy. However, they are far from optimal. A major deficiency of generic search engines is that they follow the ‘‘one size fits all’’ model and are not adaptable to individual users.
This work describes a new system User Profile Relevant Results -
UProRevs which would filter the results given by a search engine based on the user’s profile.
“UProRevs - User Profile Relevant Results” has been published by the IEEE - Computer Society as the proceedings for the 10th International Conference on Information Technology.
Web is a collection of inter-related files on one or more web servers while web mining means extracting valuable information from web databases. Web mining is one of the data mining domains where data mining techniques are used for extracting information from the web servers. The web data includes web
pages, web links, objects on the web and web logs. Web mining is used to understand the customer behaviour, evaluate a particular website based on the information which is stored in web log files. Web mining is evaluated by using data mining techniques, namely classification, clustering, and association
rules. It has some beneficial areas or applications such as Electronic commerce, E-learning, Egovernment, E-policies, E-democracy, Electronic business, security, crime investigation and digital library. Retrieving the required web page from the web efficiently and effectively becomes a challenging task
because web is made up of unstructured data, which delivers the large amount of information and increase the complexity of dealing information from different web service providers. The collection of information becomes very hard to find, extract, filter or evaluate the relevant information for the users. In this paper,
we have studied the basic concepts of web mining, classification, processes and issues. In addition to this,
this paper also analyzed the web mining research challenges.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...IJSRD
The development of the web in the past few years has created many challenges in this field. Recent work in this field addresses searching data using tree-based search patterns. Various sequential mining algorithms have been developed to date. Web usage mining is used to process web server logs, which contain the navigation history of users. The recommender system is explained in detail, along with its whole procedure. Proper search results lead to efficient search, but the problems were the time consumed and the quality of the results generated. So, a new local search algorithm is proposed for country-wise search that makes searching more efficient on a local-results basis. This approach has led to an advancement in search-based methods and the results generated.
IDENTIFYING IMPORTANT FEATURES OF USERS TO IMPROVE PAGE RANKING ALGORITHMSZac Darcy
The web is a wide, varied and dynamic environment in which different users publish their documents. Web mining is one of the data mining applications in which web patterns are explored. Studies on web mining can be categorized into three classes: usage mining, content mining and structure mining. Today, the internet has gained increasing significance, and search engines are an important tool for responding to users' interactions. Among the algorithms used to find pages desired by users is the PageRank algorithm, which ranks pages based on users' interests. As the algorithm most widely used by search engines, including Google, it has proved its worth compared to similar algorithms; but considering the growth of the Internet and the increasing use of this technology, improving its performance is one of the necessities of web mining. The current study builds on the Ant Colony algorithm and marks the most visited links with a higher amount of pheromone. Results of the proposed algorithm indicate high accuracy compared to previous methods. The Ant Colony algorithm, one of the swarm intelligence algorithms inspired by the social behavior of ants, can be effective in modeling the social behavior of web users. In addition, usage mining and structure mining techniques can be used simultaneously to improve page ranking performance.
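The pheromone idea in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation: the link names, deposit amount and evaporation rate are all invented for illustration.

```python
# Toy sketch of pheromone-weighted link ranking: each user visit
# deposits pheromone on a link, and all pheromone evaporates a little
# each step, so recently popular links rise in the ranking.
EVAPORATION = 0.1   # fraction of pheromone lost per step (invented)
DEPOSIT = 1.0       # pheromone added per visit (invented)

def step(pheromone, visited_links):
    for link in pheromone:
        pheromone[link] *= (1 - EVAPORATION)  # evaporation
    for link in visited_links:
        pheromone[link] = pheromone.get(link, 0.0) + DEPOSIT
    return pheromone

trails = {"a": 0.0, "b": 0.0}
for clicks in (["a"], ["a", "b"], ["a"]):  # simulated click log
    trails = step(trails, clicks)

ranked = sorted(trails, key=trails.get, reverse=True)
print(ranked)  # "a" ranks above "b" because it was visited more often
```

Link "a" accumulates more pheromone than "b" (three deposits versus one), so it is ranked first.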
Identifying Important Features of Users to Improve Page Ranking Algorithms dannyijwest
The increase in the number of ontologies on the Semantic Web, and the endorsement of OWL as the language of discourse for the Semantic Web, has led to a scenario where research efforts in ontology engineering may make ontology development through reuse a viable option for ontology developers. The advantages are twofold: when existing ontological artefacts from the Semantic Web are reused, semantic heterogeneity is reduced, which helps interoperability, the essence of the Semantic Web. From the perspective of ontology development, the advantages of reuse lie in cutting down cost as well as development time, since ontology engineering requires expert domain skills and is a time-consuming process. We have devised a framework to address the challenges associated with reusing ontologies from the Semantic Web. In this paper we present the methods adopted for extraction and integration of concepts across multiple ontologies. Our extraction method is based on features of OWL language constructs and context, and for integration a relative semantic similarity measure is devised. We also present guidelines for evaluating the constructed ontology. The proposed methods have been applied to concepts from a food ontology, and evaluation has been done on concepts from the domain of academics using the Golden Ontology Evaluation Method, with satisfactory outcomes.
Comparable Analysis of Web Mining Categoriestheijes
Web data mining is a current field of analysis combining two research areas, data mining and the World Wide Web. Web data mining research draws on various research areas such as databases, artificial intelligence and information retrieval. The mining techniques are categorized into Web Content Mining, Web Structure Mining and Web Usage Mining. In this work, these mining techniques are analysed. From the analysis it is concluded that web content mining deals with an unstructured or semi-structured view of data, web structure mining deals with link structure, and web usage mining mainly concerns user interaction.
Recommendation generation by integrating sequential pattern mining and semanticseSAT Journals
Abstract: As Internet usage keeps increasing, the number of web sites, and hence the number of web pages, also keeps increasing. A recommendation system can be used to provide a personalized web service by suggesting the pages that are likely to be accessed in the future. Most recommendation systems are based on association rule mining or on keywords. With association rule mining the prediction rate is low, as it does not take into account the order in which users access web pages; keyword-based recommendation systems provide less relevant results. This paper proposes a recommendation system that uses the advantages of sequential pattern mining and semantics over association-rule and keyword-based systems respectively. Keywords: Sequential Pattern Mining, Taxonomy, Apriori-All, CS-Mine, Semantic, Clustering
GOOGLE SEARCH ALGORITHM UPDATES AGAINST WEB SPAMieijjournal
With search engines' increasing importance in people's lives, there are more and more attempts to illegitimately influence page ranking by means of web spam, and web spam detection is becoming a major challenge for internet search providers. The Web contains a huge number of profit-seeking ventures attracted by the prospect of reaching millions of users at a very low cost, so there is an economic incentive for manipulating search engine listings by creating otherwise useless pages that score a high ranking in the search results. Such manipulation is widespread in the industry, and there is a large gray area between ethical SEO and unethical SEO, i.e. the spam industry. SEO services range from getting web pages indexed to creating millions of fake web pages to deceive search engine ranking algorithms. Today's search engines need to adapt their ranking algorithms continuously to mitigate the effect of spamming tactics on their search results, and search engine companies keep their ranking algorithms and ranking features secret to protect their ranking systems from being gamed by spam tactics. We have tried to collect here, from different sources, all of Google's search algorithm updates against web spam.
International Journal of Engineering Research and DevelopmentIJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
Intelligent Web Crawling (WI-IAT 2013 Tutorial)Denis Shestakov
<<< Slides can be found at http://www.slideshare.net/denshe/intelligent-crawling-shestakovwiiat13 >>>
-------------------
Web crawling, the process of collecting web pages in an automated manner, is the primary and ubiquitous operation used by a large number of web systems and agents, from a simple program for website backup to a major web search engine. Due to the astronomical amount of data already published on the Web and the ongoing exponential growth of web content, any party that wants to take advantage of massive-scale web data faces a high barrier to entry. We start with background on web crawling and the structure of the Web. We then discuss different crawling strategies and describe adaptive web crawling techniques leading to better overall crawl performance. We finally overview some of the challenges in web crawling by presenting topics such as collaborative web crawling, crawling the deep Web and crawling multimedia content. Our goals are to introduce the intelligent systems community to the challenges in web crawling research, present intelligent web crawling approaches, and engage researchers and practitioners on open issues and research problems. Our presentation may be of interest to the web intelligence and intelligent agent technology communities, as it particularly focuses on the use of intelligent/adaptive techniques in the web crawling domain.
-------------------
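The collection process described in the abstract above can be sketched as a breadth-first traversal over a link graph. The site map below is invented; a real crawler would fetch URLs over HTTP, parse out links, and respect robots.txt and politeness delays.

```python
# Minimal breadth-first crawler sketch over a simulated link graph.
from collections import deque

site = {  # hypothetical page -> outlinks (stands in for real HTTP fetches)
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/p1", "/products/p2"],
    "/products/p1": [],
    "/products/p2": ["/"],
}

def crawl(start):
    seen, frontier, order = {start}, deque([start]), []
    while frontier:
        url = frontier.popleft()
        order.append(url)           # "fetch" the page
        for link in site.get(url, []):
            if link not in seen:    # avoid revisiting pages
                seen.add(link)
                frontier.append(link)
    return order

print(crawl("/"))  # pages in breadth-first discovery order
```

The `seen` set is what keeps the crawl from looping on cycles such as `/about -> /`.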
International conference On Computer Science And technology
1. Gyancity Journal of Electronics and Computer Science
Vol.3, No.2, pp. 1-8, September 2018
ISSN:2446-2918 DOI: 10.21058/gjecs.2018.32001
Application of Page Ranking Algorithm Based on
Numbers of Link Visits in Web Recommendation
System for Online Business
Abstract- The World Wide Web plays a vital role as a global information service center. Online business is growing very rapidly through the creation of websites. A website is made up of a number of webpages; a webpage is a dynamic collection of hyperlinks and usage information, providing rich sources of web data for web mining. Due to the exponential growth of dynamic information over the internet, information overload creates big challenges for researchers in this area. One of the key components that ensures the acceptance of a web page search service is the page ranker, a component said to have been the main contributing factor to Google's success. This paper discusses page ranking algorithms based on content, structure and usage in web mining, and proposes a page ranking algorithm based on the number of link visits. Since every product has its own unique web page for its description, selecting the most visited links can be used to create the recommendation list for the corresponding customer in online business.
Keywords-- PageRank, Web Mining, Recommendation System, Information retrieval, Web Graph, Online business.
1. Introduction
Web mining is defined as the application of data mining techniques on the World Wide Web to find hidden information. This hidden information, i.e. knowledge, could be contained in the content of web pages, in the link structure of the WWW [8], or in web server logs. The WWW is a vast resource of hyperlinked and heterogeneous information including text, image, audio, video, and metadata. With the rapid growth of information sources available on the WWW and the growing needs of users, it is becoming difficult to manage the information on the web and satisfy user needs. In effect, we are drowning in data but starving for knowledge. Therefore, it has become increasingly necessary for users to use information retrieval techniques to find, extract, filter and order the desired information.
A search engine [2] receives a user's query, processes it, and searches its index for relevant documents, i.e. documents that are likely related to the query and supposed to be interesting. The search engine then ranks the documents found relevant and shows them as results. This process can be divided into the following tasks:
Crawling: the crawler [11,12] is in charge of visiting as many pages as possible and retrieving the information needed from them. This information is stored for later use by the search engine.
Indexing: the information provided by the crawler has to be stored so it can be accessed by the search engine. As the user will be in front of his computer waiting for the search engine's answer, response time becomes an important issue. That is why this information is indexed, to decrease the time needed to look through it.
Searching: The web search engine represents the user interface needed to permit the user to query the information. It is the connection
between the user and the information repository.
Sorting/Ranking: due to the huge amount of information existing on the web, when a user sends a query about a general topic (e.g. "java course"), there exists an incredible number of pages related to the query, but only a small part of that information will be really interesting for the user. That is why search engines incorporate ranking algorithms to sort the results.
Recommendation: there is an individual page for every product and service provided by an online business. A page ranking algorithm ranks every page, which gives the popularity of the web page [9,10] and can be used to surface the most popular products.
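The tasks just described can be sketched end-to-end in a minimal toy pipeline. All page texts, identifiers and visit counts below are invented for illustration; real engines use far more elaborate index structures and ranking signals.

```python
# Toy sketch of the search-engine tasks above: index, search, rank.
pages = {  # hypothetical crawled pages
    "p1": "java course for beginners",
    "p2": "advanced java programming course",
    "p3": "cooking recipes",
}

def build_index(pages):
    """Indexing: map each term to the set of pages containing it."""
    index = {}
    for pid, text in pages.items():
        for term in text.split():
            index.setdefault(term, set()).add(pid)
    return index

def search(index, query):
    """Searching: return pages containing every query term."""
    results = set(pages)
    for term in query.split():
        results &= index.get(term, set())
    return results

def rank(results, popularity):
    """Ranking: order results by a popularity score (e.g. visit counts)."""
    return sorted(results, key=lambda pid: popularity.get(pid, 0), reverse=True)

index = build_index(pages)
hits = rank(search(index, "java course"), {"p1": 10, "p2": 25})
print(hits)  # p2 before p1; p3 is filtered out at the search step
```

The split mirrors the text: the index answers "which pages match?" quickly, and the ranking step decides the order in which matches are shown.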
Today is the era of online business. There are millions of small and large companies that provide online facilities for the sale and purchase of products and services. Every product has a different link for its description; hence PageRank helps identify the popularity of the product.
This paper is organized into seven parts. The first part discusses the idea of web mining and its types; the second part is a related study of page ranking algorithms; the third part discusses problem identification; the fourth part is the implementation of a page ranking algorithm based on the number of link visits; the fifth part discusses its application in online business; the sixth part is a comparative analysis; and the last part presents the conclusion of the paper.
1. Mahesh Kumar Singh, Research Scholar, Department of Computer Science and Informatics, University of Kota, Kota, Rajasthan, India. maheshkrsg@gmail.com
2. Om Prakash Rishi, Director Research, University of Kota, Kota, Rajasthan, India. omprakashrishi@yahoo.com
3. Sumit Wadhwa, Assistant Professor, Department of Information Technology & Management, Greater Noida, U.P., India. sumitwadhwa1988@gmail.com
2. Web Mining
The extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from large databases is called data mining. Web mining is the application of data mining techniques to discover and retrieve useful information and patterns (knowledge) from WWW documents and services. Web mining can be divided into three categories [1], as shown in figure 2.1:
a) Web Content Mining
b) Web Structure Mining
c) Web Usage Mining
2.1 Web Content Mining (WCM) describes the automatic search of information resources available online, and involves mining web data content. Its emphasis is on the content of the web page, not its links. It can be applied on web pages themselves or on the result pages obtained from a search engine. WCM is approached from two different points of view: the Information Retrieval (IR) view and the database view. In the IR view, most research uses a bag of words, based on statistics about single words in isolation, to represent unstructured text. For semi-structured data, these works utilize the HTML structures inside the documents. In the database view, web mining tries to infer the structure of a web site in order to transform it into a database.
2.2 Web Structure Mining (WSM) is used to generate a structural summary of web sites and web pages. A typical web graph consists of
web pages as nodes and hyperlinks as edges connecting related pages. Technically, WCM mainly focuses on the structure within a
document, while WSM tries to discover the link structure of hyperlinks at the inter-document level.
Web structure mining tries to discover the model underlying the link structures of the Web. The model is based on the topology of the
hyperlinks, with or without link descriptions. It can be used to categorize web pages and is useful for generating information such as
similarity and relationships between web sites. The link structure of the Web contains important implied information and can help in
filtering or ranking web pages. In particular, a link from page A to page B can be considered a recommendation of page B by the author
of A. New algorithms have been proposed that exploit this link structure not only for keyword searching but also for tasks such as
automatically building a Yahoo-like hierarchy or identifying communities on the Web. The qualitative performance of these algorithms
is generally better than that of IR algorithms, since they use more information than just the contents of the pages. While it is possible
to influence the link structure of the Web locally, it is quite hard to do so at a global level, so link analysis algorithms that work at a
global level are relatively robust against spamming.
Fig 2.1 Taxonomy of Web Mining
2.3 Web Usage Mining (WUM) tries to discover user navigation patterns and useful information from the secondary data derived from
users' interactions while surfing the Web. It focuses on techniques that can predict user behavior while the user interacts with the Web.
This type of web mining allows the collection of access information for web pages; the usage data records the paths leading to the
accessed pages. This information is usually gathered automatically into access logs by the web server. CGI scripts offer other useful
information, such as referrer logs, user subscription information and survey logs. This category is important to the overall use of data
mining by companies and their internet/intranet-based applications and information access.
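The aggregation of usage data described above can be sketched as follows. This is an illustrative fragment, not the paper's implementation: the sample log lines and URLs are invented, and only the common Apache log format is assumed.

```python
# A sketch of the usage-data collection step: aggregating per-page hit
# counts from server access-log lines. The sample lines are illustrative,
# in the common Apache log format.
from collections import Counter

sample_log = """\
127.0.0.1 - - [10/Sep/2018:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 512
127.0.0.1 - - [10/Sep/2018:10:00:05 +0000] "GET /product.html HTTP/1.1" 200 734
127.0.0.1 - - [10/Sep/2018:10:00:09 +0000] "GET /index.html HTTP/1.1" 200 512
"""

hits = Counter()
for line in sample_log.splitlines():
    parts = line.split()
    if len(parts) > 6:
        hits[parts[6]] += 1  # the 7th whitespace-separated field is the URL

print(hits.most_common())  # /index.html counted twice, /product.html once
```

In a real deployment the same loop would read the server's access log file line by line; per-link visit counts for PRNLV can be extracted analogously from the referrer field.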
Each of the three categories of web mining described above has its own application areas, including site improvement, business
intelligence, web personalization, site modification, usage characterization and page ranking. Search engines generally use page ranking
to find the more important pages. The proposed PRNLV method uses web structure mining and web usage mining techniques to rank
web pages, with an application in a web recommendation system.
3. Related Work of Ranking Algorithms
The Web is very large and diverse, and many pages may be related to a given query. A method/algorithm is therefore needed to sort the
retrieved pages by how likely they are to interest the user. All the algorithms discussed below model the Web as a directed graph in
which pages are nodes and hyperlinks are edges.
3.1 PageRank Algorithm (PR): Sergey Brin and Larry Page developed the ranking algorithm used by Google, named
PageRank [2] after Larry Page (co-founder of the Google search engine), which uses the link structure of the Web to determine the
importance of web pages [7]. It takes back links into account and propagates ranking through links: a page has a high rank if the sum of
the ranks of its back links is high. A simplified version of PageRank is defined as follows:

PR(u) = c · Σ_{v ∈ B(u)} PR(v) / N_v        (3.1)

where B(u) is the set of pages pointing to u, N_v is the number of outgoing links of page v, and c is a factor used for normalization.
Note that 0 < c < 1, because there are pages without incoming links and their weight is lost. PageRank was later modified, observing
that not all users follow the direct links on the WWW:

PR(u) = (1 − d) + d · Σ_{v ∈ B(u)} PR(v) / N_v        (3.2)

where d is a dampening factor, usually set to 0.85 (any value between 0 and 1). d can be thought of as the probability that users follow
the links, and (1 − d) can be regarded as the page rank contribution from non-directly-linked pages. Consider the following directed
graph [4]:
Fig 3.1 Example graph
The PageRanks for pages A, B and C in figure 3.1, calculated using (3.2) with d = 0.5, are: PR(A) = 1.2, PR(B) = 1.2, PR(C) = 0.8
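The iterative computation behind equation (3.2) can be sketched in a few lines. Since figure 3.1 is not reproduced here, the three-page graph below is an assumed illustration, so its rank values differ from the example above.

```python
# Iterative PageRank per equation (3.2), d = 0.5. The graph is
# illustrative (figure 3.1 is not reproduced): A links to B and C,
# B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
d = 0.5
ranks = {page: 1.0 for page in links}

for _ in range(100):  # iterate until (practically) converged
    new = {}
    for page in links:
        # B(page): pages that link to this page
        backlinks = [v for v in links if page in links[v]]
        new[page] = (1 - d) + d * sum(ranks[v] / len(links[v])
                                      for v in backlinks)
    ranks = new

print({p: round(r, 3) for p, r in ranks.items()})
# ≈ {'A': 1.077, 'B': 0.769, 'C': 1.154}
```

Note that the total rank mass stays equal to the number of pages when every page has at least one incoming and outgoing link, which is why the example values above sum to 3.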
3.2 Weighted Page Rank Algorithm (WPR):
Wenpu Xing and Ali Ghorbani [5] proposed an extension to standard PageRank called Weighted PageRank (WPR), which ranks pages
according to their importance rather than considering the link structure of the web graph alone. Instead of dividing the rank value of a
page evenly among its outgoing linked pages, the algorithm assigns larger rank values to more important pages: each outlinked page
gets a value proportional to its popularity, measured by its numbers of inlinks and outlinks [6]:

WPR(u) = (1 − d) + d · Σ_{v ∈ B(u)} WPR(v) · W^in_(v,u) · W^out_(v,u)        (3.3)

with W^in_(v,u) = I_u / Σ_{p ∈ R(v)} I_p and W^out_(v,u) = O_u / Σ_{p ∈ R(v)} O_p, where I and O denote the numbers of inlinks and
outlinks of a page and R(v) is the set of pages referenced by v. The page ranks for pages A, B and C, calculated using (3.3) with
d = 0.5, are PR(A) = 0.65, PR(B) = 0.93, PR(C) = 0.60.
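The weighting scheme of equation (3.3) can be sketched as follows. The small graph is again an assumed illustration, not the paper's figure 3.1, so the resulting values are not the example values quoted above.

```python
# A sketch of Weighted PageRank per equation (3.3), d = 0.5.
# The graph is illustrative: A links to B and C, B to C, C to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
d = 0.5

def inlinks(u):
    return [v for v in links if u in links[v]]

def w_in(v, u):
    # W^in_(v,u): share of inlinks that u holds among pages referenced by v
    return len(inlinks(u)) / sum(len(inlinks(p)) for p in links[v])

def w_out(v, u):
    # W^out_(v,u): share of outlinks that u holds among pages referenced by v
    return len(links[u]) / sum(len(links[p]) for p in links[v])

ranks = {page: 1.0 for page in links}
for _ in range(100):
    ranks = {u: (1 - d) + d * sum(ranks[v] * w_in(v, u) * w_out(v, u)
                                  for v in inlinks(u))
             for u in links}
```

Because the weights are shares rather than a uniform 1/N_v split, a page with many inlinks and outlinks draws a larger fraction of its referrers' rank than under plain PageRank.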
3.3 Page Content Rank Algorithm (PCR):
Jaroslav Pokorny and Jozef Smizansky [3] proposed a new page relevance ranking method employing the WCM technique, called
Page Content Rank (PCR). The method combines a number of heuristics that appear important for analyzing the content of web pages.
Page importance is determined on the basis of the importance of the terms the page contains, where the importance of a term is specified
with respect to a given query q. PCR uses a neural network as its inner classification structure. The importance of a page P in PCR is
calculated as an aggregate value of the importance of all terms that P contains. To promote the significant terms and suppress the others,
the second moment is used as the aggregate function [2]:

Page_importance(P) = sec_moment({importance(t) : t ∈ P})        (3.4)
3.4 Hyperlinked Induced Topic Search Algorithm (HITS) [13, 14]:
This algorithm assumes that for every query topic there is a set of "authoritative" or "authority" pages/sites that are relevant and popular
with respect to the topic, and a set of "hub" pages/sites that contain useful links to relevant sites, including links to many related
authorities, as shown in figure 3.2.
Fig 3.2 Hubs and Authorities
Working of HITS: HITS works in two phases, sampling and iterative. In the sampling phase, a set of relevant pages for the given
query is collected, i.e. a sub-graph S of G that is rich in authority pages is retrieved. The iterative phase finds hubs and authorities
using the output of the sampling phase, via the following equations:

H_p = Σ_{q ∈ I(p)} A_q        A_p = Σ_{q ∈ B(p)} H_q

where H_p is the hub weight, A_p is the authority weight, and I(p) and B(p) denote the sets of reference and referrer pages of page p.
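The iterative phase can be sketched as below. The sampled sub-graph is an assumed toy example, and L2 normalisation per round (the usual way to keep HITS scores bounded) is added as an assumption, since the paper does not spell it out.

```python
# A sketch of the HITS iterative phase on a tiny sampled sub-graph,
# with L2 normalisation each round. The graph is illustrative:
# p1 and p2 act as hubs, p3 and p4 as authorities.
import math

links = {"p1": ["p3", "p4"], "p2": ["p3", "p4"], "p3": [], "p4": ["p1"]}
hubs = {p: 1.0 for p in links}
auths = {p: 1.0 for p in links}

for _ in range(50):
    # A_p: sum of hub weights of the pages that refer to p (the set B(p))
    auths = {p: sum(hubs[q] for q in links if p in links[q]) for p in links}
    # H_p: sum of authority weights of the pages p refers to (the set I(p))
    hubs = {p: sum(auths[q] for q in links[p]) for p in links}
    for scores in (auths, hubs):
        norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
        for p in scores:
            scores[p] /= norm
```

After convergence, p3 and p4 (pointed to by both hubs) carry the highest authority weights, while p1 and p2 carry the highest hub weights, matching the hub/authority picture of figure 3.2.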
4.0 Problems and Issues of the Page Ranking Algorithms
The main problems and issues of discussed page ranking algorithms are summarized as:
4.1 Rank quality of PageRank: The ranking algorithms discussed above have shown very high quality, as evidenced by the success of
Google and by the fact that they are still in use. Nevertheless, several improvements can be made.
4.2 Data mining technique of PageRank: The PageRank algorithm uses only the web structure mining technique; it does not use web
content or web usage mining, which could significantly improve the rank quality of web pages with respect to users' information needs.
4.3 PageRank is static in nature: In the PageRank algorithm, the importance or rank score of each page is static; the rank changes only
with the link structure of the Web.
5. Page Ranking Algorithm Based on Numbers of Link Visits (PRNLV)
PRNLV (Page Ranking based on Number of Link Visits) is based on web structure mining and web usage mining; it takes users' visits
to pages/links into account in order to determine the importance and relevance scores of web pages. To accomplish the complete task,
from gathering the usage characterization to final rank determination, several subtasks are performed:
a) Storage of users' access information (hits) on each outgoing link of a page in the related server log files.
b) Fetching of pages and their access information by the targeted web crawler.
c) For each page link, computation of weights based on the probability of its being visited by users.
d) Final rank computation of pages based on the weights of their incoming links.
e) Retrieval of ranked pages corresponding to user queries.
5.1 Calculation of Visits (hits) of links
If p is a page with outgoing-link set O(p), and each outgoing link is associated with an integer visit count (VC), then the weight of the
outgoing link connecting page p to page o is calculated by

W_(p,o) = VC(p,o) / Σ_{q ∈ O(p)} VC(p,q)        (5.1)

The page rank is then computed as

PRNLV(u) = (1 − d) + d · Σ_{v ∈ B(u)} PRNLV(v) · W_(v,u)        (5.2)

where d is the damping factor as used in PageRank and W_(v,u) is the weight of the link from v to u calculated by (5.1). The iteration
method is used for the calculation of the page rank. For the example graph of figure 3.1, taking d = 0.5, these equations can easily be
solved by iteration; the final results are:
PRNLV(A) = 1.08, PRNLV(B) = 1.26, PRNLV(C) = 0.66
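Equations (5.1) and (5.2) can be sketched together as below. The graph and visit counts are an assumed illustration, not the actual server-log data of figure 5.2, so the resulting ranks differ from the values above.

```python
# A sketch of PRNLV: link weights from visit counts per (5.1), then
# the iterative rank per (5.2) with d = 0.5. Counts are illustrative.
d = 0.5
# visits[(p, o)] = recorded hits on the link from page p to page o
visits = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 2, ("C", "A"): 4}
pages = {p for edge in visits for p in edge}

def weight(p, o):
    # equation (5.1): o's share of all clicks leaving page p
    total = sum(vc for (src, _), vc in visits.items() if src == p)
    return visits[(p, o)] / total

ranks = {p: 1.0 for p in pages}
for _ in range(100):  # equation (5.2), iterated to convergence
    ranks = {u: (1 - d) + d * sum(ranks[v] * weight(v, u)
                                  for (v, o) in visits if o == u)
             for u in pages}
```

Unlike plain PageRank, a rank is split among outgoing links in proportion to how often users actually click them, which is what makes PRNLV rank dynamic with respect to browsing behaviour.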
Fig 5.2 Numbers of the Hits of the links
5.2 Experimental Evaluation:
To evaluate the performance and effectiveness of the proposed algorithm, a prototype was designed: a web site was created using
HTML, CSS and JavaScript, and a WAMP server was used to store the activities of the users. The web graph of the site is shown in
figure 5.1. The data were collected by installing the model on the local host; users were able to log into the site using their e-mail
address as ID and visit the different links. Every page has a distinct URL with its own visit count, as shown in figure 5.2. The page
rank of each link (page) is then computed using formula (5.2), and the links are arranged in decreasing order of page rank, as shown in
figure 5.3, so the highest-ranked page appears at the top of the list. The variation of the page rank of the pages in the web site is shown
in figure 5.4.
Fig 5.3 Page Rank Using (PR and PRNLV) Fig 5.4 Variation of PRNLV with PR
6. Application of PageRank in Recommendation Systems:
Every online commercial company maintains dedicated pages containing the specifications of its products. Since every page has its
own unique URL, the popularity of a product can be computed from the popularity of its links, that is, from the page rank of the page.
A recommendation system tries to identify a user's interest in a specific domain of content based on previous experience. When a user
interacts with an e-commerce site, he/she provides a set of implicit or explicit signals about his/her taste, such as clicks, ratings and
comments. Association rule mining plays a vital role in associating an e-mail ID with a page and its rank, which can be helpful in
creating the recommended list for a specific customer.
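One simple way to combine a customer's click history with page ranks, as described above, can be sketched as follows. All page names, rank values and the e-mail address are hypothetical, and a real system would derive the click sets via association rule mining rather than a plain lookup.

```python
# A hypothetical sketch of the recommendation step: join a customer's
# e-mail ID with page ranks and suggest top-ranked unvisited pages.
ranks = {"/tv.html": 1.26, "/phone.html": 1.08, "/shoes.html": 0.66}
clicks = {"user@example.com": ["/phone.html", "/shoes.html"]}

def recommend(email, top_n=2):
    seen = set(clicks.get(email, []))
    # rank-ordered product pages the customer has not visited yet
    candidates = sorted((p for p in ranks if p not in seen),
                        key=ranks.get, reverse=True)
    return candidates[:top_n]

print(recommend("user@example.com"))  # suggests the unseen top-ranked page
```

For a customer with no recorded clicks, the same function simply falls back to the globally highest-ranked pages.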
7. Comparison of PRNLV with PR and WPR: As table 1 shows, PRNLV improves on PR and WPR in relevance and rank dynamics, at the cost of extra crawling effort and weight calculations.
Table 1 Comparison of Page Ranking Algorithms

Description
  PageRank (PR): Computes scores at indexing time; results are sorted according to the importance of pages.
  Weighted PageRank (WPR): Computes scores at indexing time; results are sorted according to the importance of pages.
  PRNLV: Computes scores at indexing time; pages are sorted according to both importance and relevance.

Mining Technique Used
  PR: Web Structure Mining.
  WPR: Web Structure Mining.
  PRNLV: Web Structure Mining and Web Usage Mining.

Rank Distribution
  PR: Ranks are equally distributed to outgoing links.
  WPR: Ranks are distributed unequally, in proportion to the popularity (inlinks and outlinks) of the linked pages.
  PRNLV: Ranks are unequally distributed among outgoing links according to their probabilities of being visited.

Input Parameters
  PR: Inbound links of pages.
  WPR: Inbound and outbound links of pages.
  PRNLV: Inbound links, outbound links and visit counts of links.

Working Levels
  PR: n*   WPR: n*   PRNLV: N

Complexity
  PR: O(log n)   WPR: O(log n)   PRNLV: > O(log n)

Nature of Rank
  PR: Less dynamic (rank changes only with link structure).
  WPR: Less dynamic (rank changes only with link structure).
  PRNLV: More dynamic (rank changes with visit counts as well as link structure).

Relevancy of Pages
  PR: No   WPR: No   PRNLV: Yes

Importance of Pages
  PR: Yes   WPR: Yes   PRNLV: Yes

Quality of Result
  PR: Low   WPR: High   PRNLV: High

Advantages
  PR: Ranks are computed with minimum effort and low complexity.
  WPR: Ranks are computed with minimum effort and low complexity.
  PRNLV: Returned pages are of high quality and relevancy because user feedback is taken into account; the search space can be heavily pruned since pages are sorted according to users' information needs.

Limitations
  PR: No relevancy of pages is considered in rank computation; all links are treated as equally important.
  WPR: No relevancy of pages is considered in rank computation; all links are treated as equally important.
  PRNLV: Extra effort is required by crawlers to fetch visit counts from web servers, and extra calculations are needed to find the weights of links.
8. Conclusion
Mining knowledge from a huge amount of data is a very complex task, and the World Wide Web plays a vital role in information
collection and sharing. Ranking algorithms are used to retrieve relevant information efficiently, and different page ranking algorithms
rely on different techniques. PRNLV takes the user's browsing information into consideration to calculate the rank of a document,
rather than relying on link structure alone; because of this, the PRNLV system is more dynamic than the other ranking algorithms. The
popularity of web pages (products) computed by the page ranking algorithm can also be used to assess the performance of products in
online business.
REFERENCES
[1] N. Duhan, A. K. Sharma, K. K. Bhatia, "Page Ranking Algorithms: A Survey", 2009 IEEE International Advance Computing Conference
(IACC 2009), Patiala, India, 6-7 March 2009.
[2] A. Broder, "A Taxonomy of Web Search", Technical report, IBM Research, 2002.
[3] J. Pokorny, J. Smizansky, "Page Content Rank: An Approach to the Web Content Mining".
[4] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, R. Stata, "Graph Structure in the Web", Proceedings of the 9th International World Wide
Web Conference, 2000.
[5] W. Xing, A. Ghorbani, "Weighted PageRank Algorithm", Proceedings of the Second Annual Conference on Communication Networks and
Services Research (CNSR'04), IEEE, 2004.
[6] G. Salton, C. Buckley, "Term Weighting Approaches in Automatic Text Retrieval", Information Processing and Management, Vol. 24, No. 5,
pp. 513-523, 1988.
[7] S. Chakrabarti, B. E. Dom, S. R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, D. Gibson, J. Kleinberg, "Mining the Web's Link
Structure", Computer, 32(8):60-67, 1999.
[8] T. Berners-Lee, R. Cailliau, A. Luotonen, H. F. Nielsen, A. Secret, "The World-Wide Web", Communications of the ACM, 37(8):76-82, 1994.
[9] M. G. Campana, F. Delmastro, "Recommender Systems for Online and Mobile Social Networks: A Survey", Online Social Networks and
Media, 3-4 (2017), pp. 75-97.
[10] Q. Su, L. Chen, "A Method for Discovering Clusters of E-commerce Interest Patterns Using Click-stream Data", Electronic Commerce
Research and Applications, Elsevier, 2014. http://dx.doi.org/10.1016/j.elerap.2014.10.002
[11] B. Pinkerton, "Finding What People Want: Experiences with the WebCrawler", The Second International WWW Conference, Chicago, 1994.
[12] J. Cho, H. Garcia-Molina, L. Page, "Efficient Crawling through URL Ordering", Computer Networks and ISDN Systems, 30(1-7):161-172,
1998.
[13] C. Ding, X. He, P. Husbands, H. Zha, H. Simon, "Link Analysis: Hubs and Authorities on the World Wide Web", Technical report 47847,
2001.
[14] A. Borodin, G. O. Roberts, J. S. Rosenthal, P. Tsaparas, "Finding Authorities and Hubs from Link Structures on the World Wide Web",
World Wide Web, pages 415-429, 2001.
[15] S. L. Yong, M. Hagenbuchner, A. C. Tsoi, "Ranking Web Pages using Machine Learning Approaches", IEEE/WIC/ACM International
Conference on Web Intelligence and Intelligent Agent Technology, 2008.
Author’s Profile
Mahesh Kumar Singh received his B.Tech. from UPTU Lucknow in 2004 and his M.Tech. from Jamia Hamdard
University, New Delhi in 2012, and is pursuing a Ph.D. in Computer Science at the University of Kota on web
mining and its application to web business intelligence.
Dr. Om Prakash Rishi is Director of Research at the University of Kota and Head of the Department of Computer
Science & Informatics. His research areas are Artificial Intelligence, Intelligent Systems & CBR, Software
Systems and Agent Technology, Cloud Computing, and Information Security.
Sumit Wadhwa received his B.Tech. from Kurukshetra University, Kurukshetra in 2009 and his M.Tech. from
Kurukshetra University, Kurukshetra in 2013. His research areas are Wireless Sensor Networks and Artificial
Intelligence.