The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The Web is a collection of interrelated files on one or more web servers, while web mining means extracting valuable information from web databases. Web mining is a data mining domain in which data mining techniques are used to extract information from web servers. Web data includes web pages, web links, objects on the web, and web logs. Web mining is used to understand customer behaviour and to evaluate a particular website based on the information stored in web log files. Web mining is carried out using data mining techniques, namely classification, clustering, and association rules. Its application areas include electronic commerce, e-learning, e-government, e-policies, e-democracy, electronic business, security, crime investigation, and digital libraries. Retrieving the required web page efficiently and effectively is a challenging task because the web is made up of unstructured data, which delivers a large amount of information and increases the complexity of handling information from different web service providers. It becomes very hard to find, extract, filter, or evaluate the information relevant to users. In this paper, we study the basic concepts of web mining, its classification, processes, and issues. In addition, the paper analyzes research challenges in web mining.
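To make the role of web log files concrete, the following is a minimal sketch of a first web usage mining step: parsing server log lines and counting successful requests per page. It assumes the logs follow the Common Log Format; the sample lines, the `LOG_PATTERN` regular expression, and the function names are illustrative assumptions, not taken from the paper.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log(lines):
    """Yield one dict per well-formed log line; skip lines that do not match."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield match.groupdict()


def page_hits(lines):
    """Count successful (2xx) requests per requested path."""
    hits = Counter()
    for entry in parse_log(lines):
        if entry["status"].startswith("2"):
            hits[entry["path"]] += 1
    return hits


# Hypothetical sample data, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2023:13:56:01 +0000] "GET /products.html HTTP/1.1" 200 1500',
    '10.0.0.1 - - [10/Oct/2023:13:57:12 +0000] "GET /products.html HTTP/1.1" 200 1500',
    '10.0.0.3 - - [10/Oct/2023:13:58:40 +0000] "GET /missing.html HTTP/1.1" 404 0',
    'not a log line',
]

if __name__ == "__main__":
    for path, count in page_hits(SAMPLE_LOG).most_common():
        print(path, count)
```

Per-page counts like these are only a starting point; classification, clustering, or association rule mining would normally run on sessions reconstructed from the same parsed entries.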
International Journal of Engineering Research and Development (IJERD) - IJERD Editor
journal publishing, how to publish research paper, call for research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing of journal, publishing of research paper, research and review articles, IJERD Journal, publish research paper, open access engineering journal, engineering journal, Mathematics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer review journal, indexed journal, www.ijerd.com, research journals, yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
The International Journal of Engineering and Science (The IJES) - theijes
In today's world of information technology, businesses tend to operate electronically. Because a great deal of business now takes place on the World Wide Web (WWW), it is very important for website owners to provide a better platform to attract more customers to their sites. Presenting information more effectively is the way to bring in more customers or users. The customer is the end user, who accesses the information in a way that yields some benefit to the website owners. In this paper we define web mining and present a method for using web mining to better understand user and website behaviour, which in turn helps enhance website information to attract more users. The paper also presents an overview of research on pattern extraction and web content mining, and of how it can act as a catalyst for e-business.
Web Page Recommendation Using Web Mining - IJERA Editor
On the World Wide Web, content of many kinds is generated in huge amounts, so web recommendation has become an important part of web applications for giving users relevant results. Many kinds of web recommendation are made available to users every day, covering images, video, audio, query suggestions, and web pages. In this paper we aim to provide a framework for web page recommendation: 1) we first describe the basics and types of web mining; 2) we give details of each web mining technique; 3) we propose an architecture for personalized web page recommendation.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Data mining refers to the process of analysing the data from different perspectives and summarizing it into useful information.
Data mining software is one of a number of tools used for analysing data. It allows users to analyse data from many different dimensions and angles, categorize it, and summarize the relationships identified.
Data mining is about techniques for finding and describing structural patterns in data.
Data mining is the process of finding correlation or patterns among fields in large relational databases.
Data mining is the process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions.
Web Content Mining Based on DOM Intersection and Visual Features Concept - ijceronline
Extracting structured data from deep web pages is a challenging task because of the complex underlying structure of such pages, and because website developers generally follow different page design techniques. Data extraction from web pages is highly useful for building a database for a number of applications. A large number of techniques have been proposed to address this problem, but all of them have inherent limitations and impose different constraints on extracting data from such pages. This paper presents two approaches to structured data extraction. The first is a non-generic solution based on template detection, using the intersection of the Document Object Model (DOM) trees of various web pages from the same website. This approach gives better results in terms of efficiency and accurately locates the main data on a particular web page. The second approach is a partial tree alignment mechanism that uses important visual features such as the length, size, and position of web tables on the pages. It is a generic solution, as it does not depend on one particular website or its page template, and it locates multiple data regions, data records, and data items within a given web page. We compared our results with existing mechanisms and found them much better for a number of web pages.
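The idea of template detection by DOM intersection can be illustrated with a simplified sketch. It is not the authors' algorithm: it assumes each page can be reduced to the set of its root-to-element tag paths, treats the paths shared by all pages of a site as the template, and treats the remaining paths as candidate data regions. Text and attributes are ignored, and the class, function, and sample page names are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCollector(HTMLParser):
    """Collect the root-to-node tag path of every element in a page."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass


def dom_paths(html):
    """Return the set of tag paths found in one HTML document."""
    collector = PathCollector()
    collector.feed(html)
    return collector.paths


def split_template(pages):
    """Return (template_paths, list of page-specific path sets)."""
    path_sets = [dom_paths(page) for page in pages]
    template = set.intersection(*path_sets)
    specific = [paths - template for paths in path_sets]
    return template, specific


# Hypothetical pages from the same site, for illustration only.
PAGE_A = ("<html><body><div><h1>Shop</h1></div>"
          "<table><tr><td>Item A</td></tr></table></body></html>")
PAGE_B = ("<html><body><div><h1>Shop</h1></div>"
          "<ul><li>Item B</li></ul></body></html>")

if __name__ == "__main__":
    template, specific = split_template([PAGE_A, PAGE_B])
    print("template:", sorted(template))
    print("page A data region:", sorted(specific[0]))
    print("page B data region:", sorted(specific[1]))
```

Comparing path sets is cheaper than aligning full trees, but it cannot tell apart repeated sibling records, which is where a tree alignment approach such as the second one described above would be needed.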
Integrated Web Recommendation Model with Improved Weighted Association Rule M... - ijdkp
The World Wide Web plays a significant role in human life, and it requires technological improvement to satisfy user needs. Web log data is essential for improving the performance of the web. It is large, heterogeneous, and diverse. Analyzing web log data is a tedious process for web developers, web designers, technologists, and end users. In this work, a new weighted association mining algorithm is developed to identify the best association rules, which are useful for website restructuring and recommendation that reduce false visits and improve users' navigation behavior. The algorithm finds frequent itemsets in a large uncertain database. The problem with existing algorithms is that they scan the database repeatedly, which leads to a complex output set and a time-consuming process. The proposed algorithm scans the database only once, at the beginning of the process, and stores the generated frequent itemsets in the database. Evaluation parameters such as support, confidence, lift, and number of rules are used to compare the performance of the proposed algorithm with a traditional association mining algorithm. The new algorithm produced the best results, helping developers restructure their websites to meet end users' requirements within a short time span.
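The evaluation parameters named above have standard definitions, which the sketch below computes for a single rule over a handful of browsing sessions. It uses the ordinary unweighted formulas, not the weighted algorithm proposed in the paper, and the session data and function names are hypothetical.

```python
def support(sessions, itemset):
    """Fraction of sessions that contain every page in itemset."""
    if not sessions:
        raise ValueError("sessions must not be empty")
    itemset = set(itemset)
    return sum(1 for session in sessions if itemset <= session) / len(sessions)


def rule_metrics(sessions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    sup_a = support(sessions, antecedent)
    sup_c = support(sessions, consequent)
    sup_both = support(sessions, set(antecedent) | set(consequent))
    # confidence = P(consequent | antecedent)
    confidence = sup_both / sup_a if sup_a else 0.0
    # lift > 1 means the pages co-occur more often than if independent
    lift = confidence / sup_c if sup_c else 0.0
    return {"support": sup_both, "confidence": confidence, "lift": lift}


# Hypothetical sessions: each is the set of pages one visitor viewed.
SESSIONS = [
    {"/home", "/products", "/cart"},
    {"/home", "/products"},
    {"/home", "/about"},
    {"/products", "/cart"},
]

if __name__ == "__main__":
    print(rule_metrics(SESSIONS, {"/products"}, {"/cart"}))
```

In this sample, visitors who view `/products` reach `/cart` in two of three sessions, and the lift above 1 suggests that linking the two pages more directly could be worth considering in a restructuring.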
IRJET-A Survey on Web Personalization of Web Usage Mining - IRJET Journal
S. Jagan, Dr. S. P. Rajagopalan, "A Survey on Web Personalization of Web Usage Mining", International Research Journal of Engineering and Technology (IRJET), Volume 2, Issue 01, Mar-2015. e-ISSN: 2395-0056, p-ISSN: 2395-0072. www.irjet.net, published by Fast Track Publications
Abstract
Nowadays, the World Wide Web (WWW) is a rich and powerful source of information. Day by day it becomes more complex and grows in size as more information is made available online. As a result, retrieving exactly the information that users expect is becoming a more complex and critical task. One powerful concept for dealing with this problem is personalization, which is gaining importance. Personalization is a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item they have not yet considered, using a model built with content-based or collaborative filtering approaches. Web mining is an emerging field of data mining used to provide personalization on the web. It consists of three major categories: web content mining, web usage mining, and web structure mining. This paper focuses on web usage mining and the algorithms used to provide personalization on the web.
There are numerous ways to analyse web information. Generally, web content is housed in large data sets, and basic queries are used to parse them. As demands grew over time, web data mining evolved to meet the challenges of web analysis. Machine learning methodologies are the most recent to enter these analysis processes. Approaches such as decision trees, association rules, metaheuristics, and basic learning methods are adopted for assessing web data and mining data from various web instances. This study highlights these approaches from the perspective of web analysis. One of its prime goals is to investigate more data mining approaches alongside machine learning systems, and to describe the emerging collaboration of web analytics with artificial intelligence.
WEB EVOLUTION - THE SHIFT FROM INFORMATION PUBLISHING TO REASONING - ijaia
The Web, as a communication channel, has undergone a variety of developments that allow information to be published and accessed in a scalable way. With the information revolution, research studies have been conducted to improve the present situation and propose advanced versions of the Web. It is therefore important to look into the new versions of the Web in order to improve the way information is expressed, to make more intelligent choices, and to obtain better meaning from information on the Web. That is, the future Web would require a specific architecture to support the extraction of better meaning, or "reasoning". With Web 1.0 and Web 2.0, the information currently on the Web is not understandable to machines. Machine understanding is a big shift that opens the door wide to innovation and reasoning. In this work, we survey the progress of the Web from Web 1.0 through Web 2.0, Web 3.0, and Web 4.0 to Web 5.0. We point out the document types and technologies employed in order to understand the changes from Web 1.0 to Web 3.0 and to predict the future of the Web (Web 4.0 and Web 5.0). We also present the current status of, and concerns about, the Web as an information source and communication channel.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
C03406021027
1. The International Journal Of Engineering And Science (IJES)
|| Volume || 3 || Issue || 4 || Pages || 21-27 || 2014 ||
ISSN (e): 2319 – 1813 ISSN (p): 2319 – 1805
www.theijes.com The IJES Page 21
Data Harvesting through Web Mining: A Survey
Prakul Gupta1
Amit Sharma2
Dr. Sunil Kr Singh3
1,2 UG Research Scholars, Department of CSE, Bharati Vidyapeeth College of Engineering, New Delhi, India
3 Professor, Department of CSE, Bharati Vidyapeeth College of Engineering, New Delhi, India
---------------------------------------------------------ABSTRACT-------------------------------------------------
Web mining is one of the fastest-growing technologies. Experts believe that it will help businesses make
better decisions. However, even after extensive research in this field, there is uncertainty regarding the
usage of the term, and it is often confused with data mining. In this paper, we focus on resolving
such doubts by pointing out the similarities and differences between the two terms, "Data Mining" and
"Web Mining", which are often used interchangeably. We address the main categories of Web mining and its pros and
cons. We also compare the latest tools available in the market that perform Web mining. Finally, we
outline a strategy for beginners to develop a Web mining tool, which will help them understand the
framework of Web mining.
Keywords: Data mining, Web Content Mining, Web mining, Web Structure Mining, Web Usage Mining.
---------------------------------------------------------------------------------------------------------------------------------------
Date of Submission: 18 April 2014 Date of Publication: 05 May 2014
---------------------------------------------------------------------------------------------------------------------------------------
I. INTRODUCTION
The World Wide Web, being the largest repository of information, influences the everyday life of most
people: we can not only find the information we need but also easily share our knowledge and
information with others. In just over two decades, the Web has grown from a university curiosity into a
virtual society (a fundamental research, marketing and communication vehicle). Because of this wide availability of
huge amounts of data and the pressing need to turn it into meaningful information [1], we need to
use Web mining.
Maintaining the quality and accuracy of data is a critical task, and in an ever-expanding universe
of data there will be high demand for dedicated data harvesting and management tools on which to build
advanced analysis. To extract such volumes of data from the Internet, Web mining is the natural technique to
develop. It has attracted a great deal of attention in the information industry and in society as a whole in
recent years, owing to the wide availability of bulk data. The meaningful data obtained can be used for
applications ranging from enterprise applications such as context-aware advertising, database building, business
intelligence, competitive intelligence and comparison shopping to Social Web applications such as extracting data
from one or more online Social Web platforms [12].
The key challenges [4,7] encountered in the design of a Web data harvesting system and its techniques
can be summarized as follows:
• The help of human experts is needed.
• A large amount of data must be processed in a relatively short time.
• Applications dealing with personal data (e.g. applications in the field of the Social Web) must provide solid privacy guarantees.
• Approaches that depend on machine learning require a large training set of manually labelled Web pages.
• Web data sources evolve over time, which is a hurdle for Web data extraction tools that need to extract data routinely.
• Maintaining the integrity of the specifications is difficult because of the explosive growth of the Internet in recent times, which makes it hard for users to find relevant information.
• Users lose time.
• Knowledge discovery consumes a lot of system resources.
• Caching schemes fail under certain conditions.
• Pre-fetching techniques congest network traffic.
This paper is structured as follows. Section 2 gives an overview of data mining and Web
mining and points out the differences between the two. Section 3 provides a classification of Web mining
and highlights its pros and cons. Section 4 compares the latest
tools available in the market, focusing on the basic features they possess. Section 5 outlines a
strategy to help beginners develop a Web harvesting tool, and Section 6 concludes the paper.
II. DATA MINING AND WEB MINING
2.1 Data Mining
Mining is a vivid term describing the process of finding a small set of precious nuggets in a great
deal of raw material [3]. A good definition of data mining is the one given in Principles of Data Mining by David Hand:
"Data mining is the analysis of observational data sets to find unsuspected relationships and to summarize the
data in novel ways that are both understandable and useful to the data owner" [11].
Data mining is also called knowledge discovery in databases (KDD) [3]. It is the process of identifying
useful patterns and gathering data from numerous sources, such as disparate databases, texts, images and the
Web, into a single database from which it can be re-published in a unified manner [6]. The patterns must be
valid, understandable and potentially useful. Data mining is a multi-disciplinary field involving machine
learning, statistics, databases, artificial intelligence, information retrieval [2,8], induction, neural networks and
visualization.
2.2 Web Mining
Web mining is the application of data mining techniques to discover interesting and potentially useful
information from Web data. It is generally expected that the hyperlink structure of the Web, the Web
log data, or both are used in the mining process. Web mining can also be described as the discovery and
analysis of meaningful information from Web pages and services. This covers the automatic search of
information resources available online, i.e. Web content mining, and the discovery of user access patterns from
Web servers, i.e. Web usage mining [3].
With the remarkable growth of the Web, there has been an explosive increase in the amount of data and information
published on Web pages. Research in Web mining seeks new techniques to effectively extract and mine
meaningful information from these pages [8]. Because of the diversity and structure of Web data, automated retrieval of
targeted information is a difficult task. Web mining techniques can be used to address information overload
problems such as finding relevant information, creating new knowledge out of the information available on the Web,
personalizing information, and learning about consumers or individual users, either directly or indirectly. By the direct
approach we mean that applying Web mining techniques addresses the above problems directly; by the indirect approach
we mean that Web mining techniques are used as part of a bigger application that addresses them [8]. However, we
do not claim that Web mining techniques are the only tools for solving these problems. Techniques and work from
other research areas, such as databases (DB), information retrieval (IR), natural language processing (NLP) and the Web
document community, could also be used [8]. Fig.1 summarises the differences between data mining and Web mining.
Fig.1: data mining vs. web mining
III. WEB MINING TAXONOMY
Fig.2: web mining taxonomy
Web mining techniques can broadly be classified into three categories, which are represented in Fig.2 and
compared in detail in Fig.3:
• Web Content Mining
• Web Structure Mining
• Web Usage Mining
3.1 Web Content Mining
Web content mining deals with extracting valuable information from the contents of Web pages, which goes
well beyond using keywords in a search engine. In contrast to Web usage mining and Web structure mining, Web content mining
focuses mainly on the content of the Web page rather than on the links. Web content is a very rich information resource
consisting of many types of information, for example unstructured free text, images, audio, video and metadata
as well as hyperlinks. The content of Web pages generally includes no machine-readable semantic information. Search
engines, subject directories, intelligent agents, cluster analysis and portals are employed to find what a user
might be looking for. It has been suggested that users should be able to pose more sophisticated queries than just
specifying keywords.
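As a small illustration of the idea, the sketch below pulls the title, the description metadata and the visible text out of one page using only the Python standard library. The class name `ContentExtractor`, the helper `extract_content` and the sample page are hypothetical names chosen for this example; the survey itself does not prescribe any implementation, and a real content miner would go much further (for instance by classifying or clustering the extracted text).

```python
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collect the title, meta description and visible text of one page."""

    SKIPPED = {"script", "style"}  # content that is never shown to the reader

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name") == "description":
                self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return  # ignore whitespace and script/style bodies
        if self._in_title:
            self.title = text
        else:
            self.text_parts.append(text)


def extract_content(html):
    """Return (title, description, list of visible text fragments)."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, parser.description, parser.text_parts


# Hypothetical sample page used only for this illustration.
SAMPLE_PAGE = """<html><head><title>Fresh Apples</title>
<meta name="description" content="Apple price list">
<style>p {color: red}</style></head>
<body><h1>Apples</h1><p>Price: 120 per kg</p>
<script>var x = 1;</script></body></html>"""

page_title, page_description, page_text = extract_content(SAMPLE_PAGE)
```

The extracted fragments are the raw material on which classification, clustering or keyword analysis would then operate.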
3.2 Web Structure Mining
Web structure mining deals with discovering and modelling the link structure of the Web. Work has been carried out to
model the Web based on the topology of its hyperlinks. Such models can help in discovering similarities between
sites, in identifying important sites for a specific topic or discipline, and in exploring Web communities.
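One well-known way to find "important" pages from hyperlink topology is PageRank. The survey does not name a specific algorithm, so the following is only an illustrative sketch of a simplified power-iteration PageRank; the function name `pagerank`, the damping value 0.85 and the four-page sample graph are assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages of a link graph given as {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small base share (random jump).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page C is linked to by A, B and D.
SAMPLE_LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

ranks = pagerank(SAMPLE_LINKS)
most_important = max(ranks, key=ranks.get)
```

In this small graph the page with the most incoming links, C, ends up with the highest score, which matches the intuition that heavily referenced pages are likely to be important for a topic.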
3.3 Web Usage Mining
Web usage mining deals with understanding user behaviour when interacting with the Web or with a Web site. One of its
objectives is to obtain information that may assist Web site reorganization or help adapt the site to better suit the
user [11]. The mined data usually consists of logs of users' interactions with the Web, including Web
server logs, proxy server logs and browser logs. These logs contain information about the referring pages, user
identification, the time a user spends at a site and the sequence of pages visited. Information is also collected via
cookie files. While Web structure mining shows that page X has a link to page Y, Web usage mining shows who
or how many people took that link, which site they came from and where they went when they left page Y.
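The page X / page Y example can be made concrete with a short sketch that reads server log lines and counts hits, referring pages and per-visitor page sequences. It assumes the logs are in the widely used Combined Log Format and, as a simplification, treats each client address as one visitor; the names `LOG_PATTERN`, `summarize` and the sample log lines are hypothetical and not taken from the survey.

```python
import re
from collections import Counter, defaultdict

# Assumed layout: Combined Log Format (host, identity, user, time, request,
# status, size, referrer, user agent).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log(lines):
    """Yield one dict per log line that matches the assumed format."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield match.groupdict()


def summarize(lines):
    """Return hit counts, referrer counts per page and page sequences per host."""
    hits = Counter()
    referrers = defaultdict(Counter)
    sequences = defaultdict(list)
    for entry in parse_log(lines):
        hits[entry["path"]] += 1
        referrers[entry["path"]][entry["referrer"]] += 1
        sequences[entry["host"]].append(entry["path"])
    return hits, referrers, sequences


# Hypothetical log excerpt: two visitors follow the link from x.html to y.html.
SAMPLE_LOG = [
    '10.0.0.1 - - [18/Apr/2014:10:00:00 +0000] "GET /x.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [18/Apr/2014:10:00:30 +0000] "GET /y.html HTTP/1.1" 200 734 "http://example.com/x.html" "Mozilla/5.0"',
    '10.0.0.2 - - [18/Apr/2014:10:01:00 +0000] "GET /y.html HTTP/1.1" 200 734 "http://example.com/x.html" "Mozilla/5.0"',
    '10.0.0.1 - - [18/Apr/2014:10:02:00 +0000] "GET /z.html HTTP/1.1" 200 120 "http://example.com/y.html" "Mozilla/5.0"',
]

hits, referrers, sequences = summarize(SAMPLE_LOG)
```

Here `referrers["/y.html"]` answers "where did visitors of page Y come from", and `sequences` gives the order of pages each visitor requested, including where they went after page Y.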
3.4 Pros and cons of Web Mining
Pros
• Enables personalized marketing in e-commerce, which results in higher trade volumes
• Helps classify threats and fight terrorism
• Helps identify criminal activities
• Establishes better customer relationships by responding to customer needs faster and fulfilling their requirements efficiently
• Increases profitability through target pricing based on the different profiles created after mining
Cons
• Invasion of privacy: issues arise when the data is of a personal nature
• De-individualizing users: harvesting tools judge users by their mouse clicks. De-individualization means a tendency to judge and treat people on the basis of group characteristics instead of their own individual characteristics and merits [10]
• Infringement of users' interests: companies may extract data for one particular purpose but then use the harvested data for a totally different purpose
• Trading personal data: since no law prevents website owners from trading data, there is a growing trend of selling personal data obtained from their sites as a commodity
Fig.3: comparison of web mining categories
IV. COMPARISON OF DIFFERENT HARVESTERS
Various Web harvesting software packages are available in the market. We compared
12 such tools, developed by various firms, on the basis of their cost and features. The detailed comparison is
shown in Fig.4. The features taken into account are the availability of an inbuilt scheduler, a project
editor, anonymous scraping, multi-threading and the file formats available for exporting data. This comparison will
help beginners identify features that can be included when building a Web harvester.
Fig.4: comparison of web harvesting tools
V. STRATEGY
In this part, we give an overview of what a Web harvester can do and how it functions. The
description is intended to help any beginner [7] get started with their own Web harvester. The algorithm for our
proposed Web harvester will be presented in the next paper. The preliminary step in designing any software [5] is
to create an overview of the system; Fig.5 shows the Level-0 DFD for Web harvesting software.
Fig.5: level-0 DFD
Once the overview of the system is understood, one can focus on elaborating it. The software (Web HIVE)
is divided and implemented, as depicted in Fig.6, according to the following modules:
• GUI implementation
• Basic scraping/harvesting
• Addition of extra features
• Security implementation
Fig.6: tool development strategy
5.1 GUI Implementation
The software should have an efficient and user-friendly GUI. It should be easy to
interact with and provide many features to its users. The Web harvesting software should be a point-and-click Web
harvester (visual Web harvester) that lets its user scrape data from the Web with ease. It should provide an
inbuilt Web browser for navigating to any Web page. It should be configurable to extract data from websites with a
mere mouse click, thereby minimizing the use of the keyboard and other input devices. The user just needs to select
the data to be extracted by pointing the mouse at it and clicking.
5.2 Basic Harvesting
To make the tool work properly, many websites of different categories were studied in order to
understand the patterns of data present on the Web. Based on this study, different patterns, able to
fetch data not only from a few selected websites but from a large pool of websites of various categories,
were used to scrape the data. The algorithm used to harvest data should therefore revolve around some active element
that the user clicks, on the basis of which similar data from the current page and other pages can be extracted
[9]. The main task in this module was to describe how patterns can be used to extract meaningful data. To
harvest data, the user needs to be in Config mode, which allows the user to highlight the data items to
be captured. This mode also displays a Capture Window when the data elements to be harvested are
clicked. The user can then select what to extract by choosing from the available options, such as link, image, URL,
HTML code or regular expression. These are some basic harvesting strategies.
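The idea of extracting "similar data" from one clicked element can be sketched as follows. The sketch assumes, purely for illustration, that the pattern derived from the click is the element's tag name plus one of its CSS classes, and that the page is reasonably well formed; the actual pattern-matching algorithm of the proposed tool is not given in this paper, and `PatternHarvester`, `harvest` and the sample markup are hypothetical names.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag and must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PatternHarvester(HTMLParser):
    """Collect the text of every element matching a (tag, class) pattern."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.items = []
        self._depth = 0      # greater than 0 while inside a matching element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1  # nested element inside a match
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.items.append(" ".join(self._buffer))

    def handle_data(self, data):
        if self._depth and data.strip():
            self._buffer.append(data.strip())


def harvest(html, tag, css_class):
    """Return the text of all elements similar to the one the user clicked."""
    parser = PatternHarvester(tag, css_class)
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical page: the user clicks the first product row, <li class="product">.
SAMPLE_LISTING = """<ul>
<li class="product"><b>Rice</b> 60</li>
<li class="product"><b>Sugar</b> 45</li>
<li class="ad">Buy now</li>
</ul>"""

products = harvest(SAMPLE_LISTING, "li", "product")
```

A single click on one product row is enough to capture every row with the same pattern, while unrelated elements such as the advertisement are left out. More robust approaches, such as the partial tree alignment of [9], compare whole subtrees rather than a tag and class.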
5.3 Addition of extra features
Apart from providing basic scraping facilities, the tool must be able to scrape data across multiple pages, because
the data displayed by websites often spans multiple pages; an extra feature for this should therefore be provided.
A facility to specify the range of pages from which data is to be scraped should also be provided. In addition,
if a user wishes to follow a certain link, i.e. to scrape data from the links that are similar
to the selected link, such a feature should also be provided. The scraped data can be
saved for scheduling purposes. The harvested data can be exported in .csv and .txt formats. One more convenient
feature allows the user to enter the changing field of a URL manually, so that data is
extracted until the number of pages specified by the user is reached. Also, if the user has a list of links (all
belonging to the same domain and sharing the same page layout), the user can be given another
feature to include all those links using a single configuration.
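Two of these features, the manually entered changing URL field and the .csv export, can be sketched in a few lines. The URL template with a `{page}` placeholder, the function names `page_urls` and `export_csv`, and the sample rows are assumptions made for this illustration only.

```python
import csv
import io


def page_urls(template, first, last):
    """Yield one URL per page by filling the changing field of the template."""
    for page in range(first, last + 1):
        yield template.format(page=page)


def export_csv(rows, fieldnames, stream):
    """Write harvested rows (a list of dicts) to an open text stream as CSV."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical site whose listing pages differ only in the "page" field.
urls = list(page_urls("http://example.com/items?page={page}", 1, 3))

# Hypothetical harvested rows, exported to an in-memory buffer for the demo.
harvested = [{"name": "Rice", "price": "60"}, {"name": "Sugar", "price": "45"}]
buffer = io.StringIO()
export_csv(harvested, ["name", "price"], buffer)
```

To write a real file instead of an in-memory buffer, the same function can be called with a file opened as `open(path, "w", newline="")`.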
5.4 Security Implementation
To maintain a level of anonymity while extracting data from websites, there should be an option that
allows the user to pause the miner periodically while harvesting data. This prevents the harvester from sending requests
to the website continuously over a long time, which reduces the chance of the user’s IP address being blocked by the
website. There should also be an option to prevent data loss. With this option, the miner automatically exports the
extracted data to a file on the user’s computer periodically. This is an effective way to prevent loss of data due to
unexpected problems during long mining sessions.
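Both options can be combined in one harvesting loop, sketched below under stated assumptions: the caller supplies a `fetch` function that returns one row per URL, and the parameters `pause_every`, `pause_seconds` and `save_every` are hypothetical settings a user might configure. The names `flush` and `harvest_with_pauses` are illustrative and not part of the proposed tool.

```python
import csv
import time


def flush(path, rows):
    """Append harvested rows to a CSV file so a crash loses little data."""
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerows(rows)


def harvest_with_pauses(urls, fetch, path, pause_every=5, pause_seconds=2.0,
                        save_every=10, sleep=time.sleep):
    """Fetch each URL, pausing and exporting periodically.

    fetch(url) must return one row (a list of values). The sleep function
    can be replaced, for example in tests. Returns the number of pauses.
    """
    pending = []
    pauses = 0
    for count, url in enumerate(urls, start=1):
        pending.append(fetch(url))
        if count % save_every == 0:
            flush(path, pending)   # periodic export guards against data loss
            pending = []
        if count % pause_every == 0:
            sleep(pause_seconds)   # periodic pause avoids continuous requests
            pauses += 1
    if pending:
        flush(path, pending)       # export whatever is left at the end
    return pauses
```

Because the file is opened in append mode on every export, rows written before an unexpected failure stay on disk; only the rows collected since the last export are at risk.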
Fig.7: use case diagram
VI. CONCLUSION
As the usage of the Web continues to grow, so does the opportunity to analyze data on the Web and to
extract all possible useful knowledge from it. We hope that this paper proves to be a starting
point for in-depth discussion not only for beginners but also for developers and researchers.
We have tried to provide a clear-cut view of different aspects of Web mining and to clear up
basic confusion regarding the usage of the terms Web mining and data mining. In order to reach a
larger audience, we have tried to explain Web mining in a simple way. The paper also provides basic
information about Web mining through various diagrams and tables and indicates the potential this technology
has for the future, which is essential for beginners seeking to understand its framework.
We compared different tools available in the market and provided a strategy for
beginners to develop a tool for harvesting data from the Web. The use case diagram is depicted in Fig.7. The
optimized algorithm for this tool will be provided in our next paper.
REFERENCES
[1] M. Berthold and D.J. Hand, Intelligent Data Analysis: An Introduction (Springer- Verlag New York, Inc., Secaucus, NJ, USA,
1999).
[2] S. Sarawagi, Information extraction, Foundations and Trends in Databases, 1(3):261–377, 2008. DOI: 10.1561/1500000003.
[3] Jiawei Han and Micheline Kamber, Data mining: concepts and techniques (Morgan Kaufmann Publishers Inc., San Francisco,
CA, USA, 2000).
[4] Ferrara, E., de Meo, P., Fiumara, G., and Baumgartner, R. 2012. Web Data Extraction, Applications and Techniques: A Survey,
arXiv:1207.0246v2 [cs.IR] 7 Mar 2013.
[5] Laender, A. H. F., Ribeiro-Neto, B. A. and da Silva, A. S. 2001. DEByE – Data Extraction by Example. Data and Knowledge
Engineering, 40(2), 121-154.
[6] Snowflake Software team, White Paper on Data Harvesting: On-time, accurate and easy delivery of data.
[7] Bhanu Bhardwaj, Extracting Data through Webmining, International Journal of Engineering
Research & Technology (IJERT), Vol. 1, Issue 3, May 2012, ISSN: 2278-0181.
[8] R. Kosala and H. Blockeel, “Web mining research: A survey,” ACM SIGKDD Explorations, Vol. 2, No. 1, pp. 1-15, June 2000.
[9] Yanhong Zhai and Bing Liu, Structured Data Extraction from the Web Based on Partial Tree Alignment, IEEE Transactions on
Knowledge and Data Engineering, Vol. 18, No. 12, Dec 2006.
[10] Lita Van Wel and Lambèr Royakkers, Ethical issues in web data mining. Ethics and Information Technology, v.6 n.2, p.129-140,
2004.
[11] Gupta, G.K.: Introduction to Data Mining with Case Studies (Phi Learning, 1st edn. 2008).
[12] S. Catanese, P. De Meo, E. Ferrara, G. Fiumara, and A. Provetti. Crawling facebook for social network analysis purposes. In
Proc. International Conference on Web Intelligence, Mining and Semantics, page 52, Sogndal, Norway, 2011. ACM.