International Journal of Computer-Aided Technologies (IJCAx) Vol.2, No.3, July 2015
DOI: 10.5121/ijcax.2015.2305
RESEARCH ISSUES IN WEB MINING
Dr. S. Vijayarani (1) and Ms. E. Suganya (2)
(1) Assistant Professor, Department of Computer Science, School of Computer Science and
Engineering, Bharathiar University, Coimbatore
(2) M.Phil Research Scholar, Department of Computer Science, School of Computer Science and
Engineering, Bharathiar University, Coimbatore
ABSTRACT
The web is a collection of interrelated files on one or more web servers, and web mining is the
extraction of valuable information from web data. Web mining is a data mining domain in which data
mining techniques are used to extract information from web servers. Web data includes web pages,
web links, objects on the web and web logs. Web mining is used to understand customer behaviour
and to evaluate a particular website based on the information stored in web log files. It is
carried out using data mining techniques, namely classification, clustering and association
rules. Its application areas include electronic commerce, e-learning, e-government, e-policies,
e-democracy, electronic business, security, crime investigation and digital libraries.
Retrieving the required web page efficiently and effectively is a challenging task because the
web is largely made up of unstructured data, which delivers a large amount of information and
increases the complexity of dealing with information from different web service providers. As a
result, it becomes very hard to find, extract, filter or evaluate the information relevant to
users. In this paper, we study the basic concepts of web mining and its classification, processes
and issues. In addition, the paper analyzes the research challenges in web mining.
KEYWORDS
Web Mining, Classification, Application, Tools, Algorithms, Research Issues
1. INTRODUCTION
Web mining is the application of data mining techniques to unstructured or semi-structured web
data in order to automatically discover and extract potentially useful and previously unknown
information or knowledge from the web [23]. Significant web mining applications include website
design, web search, search engines, information retrieval, network management, e-commerce,
business and artificial intelligence, web marketplaces and web communities. Online business
breaks the barriers of time and space compared with business conducted from a physical office.
Big companies around the world are realizing that e-commerce is not just buying and selling over
the Internet; it also improves their efficiency in competing with other giants in the market.
These applications also raise temporal issues for users.
Web mining is classified into three categories, namely web content mining, web structure mining
and web usage mining. Each category has its own algorithms and tools. Web content mining is the
discovery of valuable information from web documents, which may contain text, images,
hyperlinks, metadata and structured records. It is applied to the information gathered by search
engines and web spiders such as Google and Yahoo. Web structure mining is the process of
discovering structural information from websites. The structure is represented as a graph in
which web pages are the nodes and hyperlinks are the edges connecting related pages. Web usage
mining, also called web log mining, captures users' behaviour by discovering meaningful patterns
from one or more websites [9].
The web mining process consists of four important steps: resource finding, data selection and
pre-processing, generalization, and analysis [23]. Resource finding is the process of retrieving
data from online or offline text resources. In the data selection and pre-processing step,
specific information from the retrieved web sources is automatically selected and pre-processed.
During generalization, data mining and machine learning techniques are used to discover general
patterns within individual websites as well as across multiple sites. Validation and
interpretation of the mined patterns are done in the analysis step [1][17]. The three categories
of web mining (web content mining, web structure mining and web usage mining) are illustrated in
Figure 1.
Figure 1. Classification of Web Mining
The remainder of the paper is organized as follows. Section 2 discusses the research issues in
web mining. Web content mining and its research challenges are given in Section 3. Section 4
describes web structure mining. Section 5 provides details about web usage mining. Conclusions
are given in Section 6.
2. RESEARCH ISSUES IN WEB MINING
The web is highly dynamic: many pages are added, updated and removed every day, and it holds a
huge amount of information, which gives rise to a number of problems. Web data is typically high
dimensional, and access to it is constrained by limited query interfaces, keyword-oriented
search and limited customization for individual users. As a result, it is very difficult to find
relevant information on the web, which in turn creates new issues. The web mining techniques of
classification, clustering and association rules are used to understand customer behaviour and
to evaluate a particular website using traditional data mining parameters. The web mining
process is divided into four steps: resource finding, data selection and pre-processing,
generalization, and analysis [11][8]. Web measurement, or web analytics, is one of the
significant challenges in web mining. Measurement factors such as hits, page views, visits (user
sessions) and unique visitors are regularly used to measure the user impact of proposed changes.
Large institutions and organizations archive usage data from their websites [10]. A main problem
is detecting and/or preventing fraudulent activities. Web usage mining algorithms have become
more efficient and accurate, but challenges remain. Data cleaning is one of the most important
steps, yet it becomes difficult for heterogeneous data [20]. Maintaining classification accuracy
needs attention, and although many classification techniques exist, the quality of clustering
remains an open question.
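The measurement factors named above can be illustrated with a short Python sketch. The record layout, the `is_page` flag and the 30-minute inactivity threshold used to separate visits are assumptions made for this example; they are not taken from the cited works.

```python
from collections import namedtuple

# Hypothetical, simplified log record; real logs carry more fields.
Request = namedtuple("Request", "visitor timestamp url is_page")

def traffic_metrics(requests, session_gap=1800):
    """Return hits, page views, visits and unique visitors.

    session_gap (seconds) is an assumed inactivity threshold that
    separates two visits by the same visitor.
    """
    hits = len(requests)                                   # every request is a hit
    page_views = sum(1 for r in requests if r.is_page)     # only full pages count
    unique_visitors = len({r.visitor for r in requests})

    visits = 0
    last_seen = {}
    for r in sorted(requests, key=lambda r: r.timestamp):
        previous = last_seen.get(r.visitor)
        if previous is None or r.timestamp - previous > session_gap:
            visits += 1                                    # a new visit starts
        last_seen[r.visitor] = r.timestamp
    return {"hits": hits, "page_views": page_views,
            "visits": visits, "unique_visitors": unique_visitors}

sample = [
    Request("10.0.0.1", 0, "/index.html", True),
    Request("10.0.0.1", 5, "/logo.png", False),
    Request("10.0.0.1", 4000, "/about.html", True),
    Request("10.0.0.2", 100, "/index.html", True),
]
metrics = traffic_metrics(sample)
```

In this sample the first visitor returns after more than 30 minutes, so two visitors produce three visits. This shows why visits and unique visitors are reported as separate measures.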
2.1 Major issues in Web Mining
• Web data sets can be very large; storing them may take tens to hundreds of terabytes
• Such data cannot be mined on a single server, so a large number of servers is needed
• Proper organization of hardware and software to mine multi-terabyte data sets
• Limited customization, limited coverage, and limited query interface to individual users
• Automated data cleaning
• Overfitting and underfitting of data
• Oversampling of data
• Scaling up for high dimensional data
• Mining sequence and time series data
• Difficulty in finding relevant information
• Extracting new knowledge from the web
3. WEB CONTENT MINING
The data used in web content mining may be structured, semi-structured or unstructured, although
much of the web is unstructured. Web content mining is the process of converting information
from the web into more structured forms and indexing it for quick retrieval, or of finding
valuable information in web content or web documents. Web documents may consist of text, HTML
and multimedia such as images, audio and video. Search result mining works on web search
results, which may be structured or unstructured documents.
Web content mining uses many algorithms, such as the Genetic algorithm, the Cluster Hierarchy
Construction Algorithm (CHCA) and the Correlation algorithm. Content mining tools include Web
Info Extractor (WIE), Mozenda, Screen-scraper, ontology based tools, Web Content Extractor and
Automation Anywhere. Cloud users who need to extract information provided by web servers in the
cloud can make use of web mining. For instance, web communities such as Facebook can be
maintained: users with the same field of interest can be grouped and can communicate through
the network. Digital libraries perform automated citation indexing using web mining techniques.
E-services include e-banking, search engines, on-line auctions, on-line knowledge management,
social networking, e-learning, blog analysis, and personalization and recommendation systems.
These services can be analyzed so that what is offered to customers is based on
recommendations [18]. Web content mining has two approaches: (i) the agent based approach and
(ii) the database approach. Figure 2 gives the web content mining approaches.
Figure 2. Web Content Mining Approaches
(i) Agent Based Approach
The agent based approach focuses on searching for relevant information on the World Wide Web.
There are three types of agents:
(a) Intelligent search agents – automatically search for information matching a particular
query
(b) Information filtering/categorizing agents – filter and categorize the data
(c) Personalized web agents – discover documents that are related to the user profiles
(ii) Database Approach
The database approach relies on databases that contain attributes, tables and schemas with
defined domains. It focuses on techniques for organizing the semi-structured data on the web
into more structured collections of resources, and on using standard database querying
mechanisms and data mining techniques to analyze it, for example multilevel databases and web
query systems [5].
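As a minimal sketch of the database approach, the following Python example loads a few semi-structured records into a relational table and analyzes them with a standard SQL query. The records, the `product` table and its columns are hypothetical and serve only to illustrate the idea; the sketch uses Python's built-in `sqlite3` module and is not the multilevel database system referred to above.

```python
import sqlite3

# Hypothetical semi-structured records, as they might be extracted from product pages.
records = [
    {"title": "Laptop A", "price": "499.00", "site": "shop-one.example"},
    {"title": "Laptop B", "site": "shop-two.example"},            # price is missing
    {"title": "Phone C", "price": "199.50", "site": "shop-one.example"},
]

conn = sqlite3.connect(":memory:")
# A schema with defined domains: missing attributes become NULL.
conn.execute("CREATE TABLE product (title TEXT NOT NULL, price REAL, site TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO product (title, price, site) VALUES (?, ?, ?)",
    [(r["title"], float(r["price"]) if "price" in r else None, r["site"])
     for r in records],
)

# A standard SQL query over the now-structured data.
rows = conn.execute(
    "SELECT site, COUNT(*), AVG(price) FROM product GROUP BY site ORDER BY site"
).fetchall()
conn.close()
```

Once the data is in tabular form, the usual querying and data mining techniques can be applied without knowledge of the original page layout.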
Web content mining can also be divided according to the type of data that is mined: unstructured
text data mining, structured data mining, semi-structured text mining and multimedia data
mining [16].
3.1 RESEARCH ISSUES ON WEB CONTENT MINING
Web content mining has a number of research issues because it extracts information from the
results of web search engines.
• Data / information extraction concentrates on the extraction of structured data, such as
products and search results, from web pages.
• Web information integration and schema matching. The web contains a large amount of data, and
each website represents similar information in a different way. Discovering similar data is
an important problem with many practical applications.
• Opinion extraction from online sources such as customer reviews of products, forums, blogs
and chat rooms. Mining opinions is of great importance for marketing intelligence and
product benchmarking.
• Automatically segmenting web pages and detecting noise is an interesting problem in web
applications. The main content should not include advertisements, navigation links and
copyright notices. Hence, extracting the main content of a web page is an important problem
in web applications [19].
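The noise detection issue in the last point can be illustrated with a deliberately simple Python sketch based on the standard `html.parser` module. It assumes that noise is enclosed in a fixed set of HTML tags (`NOISE_TAGS`), which is a simplification chosen for this example; real pages usually require layout or density based segmentation.

```python
from html.parser import HTMLParser

# Assumed set of tags whose content is treated as noise.
NOISE_TAGS = {"nav", "footer", "aside", "script", "style", "header"}

class MainContentExtractor(HTMLParser):
    """Collects text that is not nested inside any noise tag."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.noise_depth == 0:
            self.chunks.append(text)

def extract_main_text(html):
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

page = """
<html><body>
<nav><a href="/">Home</a> <a href="/shop">Shop</a></nav>
<div><h1>Product review</h1><p>The battery lasts two days.</p></div>
<footer>Copyright 2015 Example Ltd.</footer>
</body></html>
"""
main_text = extract_main_text(page)
```

Here the navigation links and the copyright notice are dropped and only the review text is kept.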
4. WEB STRUCTURE MINING
Web structure mining is the study of data related to the structure of a particular website. It
operates on the web graph, in which web pages or web documents are the nodes and hyperlinks are
the edges connecting two related pages [7]. Figure 3 represents the web graph structure.
Figure 3. Web Graph Structure
Web structure is a useful source for extracting information. Web structure mining extracts
interesting web graph patterns such as co-citation, social choice and complete bipartite
graphs [1]. It helps to classify web pages by topic and to decide which web pages should be
added to a collection of web pages. Web structure mining can be performed either at the
intra-page level or at the inter-page level. A hyperlink that connects to a different part of
the same page is called an intra-page hyperlink; it belongs to the document structure
level [22].
A hyperlink that connects two different pages is called an inter-page hyperlink; it belongs to
the hyperlink level [12]. A web page is organized in a tree structure based on its HTML tags,
and this structure can be extracted automatically using the Document Object Model (DOM). The
main reason for developing link mining is to understand the social organization of the web.
Research on structure analysis is called link mining [14]. It lies at the intersection of work
in link analysis, hypertext and web mining, relational learning, inductive logic programming
and graph mining. Some of the important tasks of link mining are link based classification,
link based cluster analysis, and the prediction of link type, link strength and link
cardinality. Research at the hyperlink level is also called hyperlink analysis [22], which can
be used to retrieve useful information from the web [13].
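The distinction between intra-page and inter-page hyperlinks can be sketched in Python with the standard `html.parser` and `urllib.parse` modules. The page URL and the HTML fragment are made up for this example, and the rule used (a link is intra-page when it resolves to the current page once the fragment is removed) is one reasonable reading of the definitions above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag

class LinkCollector(HTMLParser):
    """Collects the href attribute of every anchor tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

def classify_links(page_url, html):
    collector = LinkCollector()
    collector.feed(html)
    base, _ = urldefrag(page_url)
    intra, inter = [], []
    for href in collector.hrefs:
        # Resolve relative links, then drop the fragment part.
        target, _fragment = urldefrag(urljoin(page_url, href))
        if target == base:
            intra.append(href)        # points into the same page
        else:
            inter.append(target)      # an edge of the web graph
    return intra, inter

html = ('<a href="#results">Results</a> '
        '<a href="/about.html">About</a> '
        '<a href="http://other.example/">Other</a>')
intra, inter = classify_links("http://site.example/index.html", html)
```

Only the inter-page links contribute edges to the web graph; the intra-page links describe the internal structure of the document.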
Web structure mining is used in search engines such as Google and Yahoo. The HITS algorithm was
used in the Clever search engine by IBM, and the PageRank algorithm is used by Google [11].
Algorithms of web structure mining include the HITS (Hyperlink-Induced Topic Search) algorithm,
the Max flow-Min cut algorithm, the ECLAT algorithm and the PageRank algorithm. Two variants of
the PageRank algorithm are the Weighted PageRank algorithm and the Topic-Sensitive PageRank
algorithm.
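For illustration, a basic PageRank computation can be sketched as a power iteration in Python. The sketch uses a commonly used normalized form, PR(p) = (1 - d)/N + d * sum of PR(q)/L(q) over the pages q that link to p, where L(q) is the number of out-links of q. The damping factor d = 0.85, the iteration count, the even redistribution of rank from pages without out-links and the four-page graph are assumptions of this example, not details taken from the cited sources.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """graph maps each page to the list of pages it links to.

    Every linked page must also appear as a key of graph.
    """
    pages = list(graph)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly
        # over all pages (one common convention).
        dangling = sum(rank[p] for p in pages if not graph[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        for p in pages:
            for q in graph[p]:
                # Each page shares its rank equally among its out-links.
                new_rank[q] += damping * rank[p] / len(graph[p])
        rank = new_rank
    return rank

# Hypothetical web graph: A links to B and C, B to C, C to A, D to C.
web_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web_graph)
best_page = max(ranks, key=ranks.get)
```

Page C receives links from three pages and therefore obtains the highest rank, while page D, which has no incoming links, keeps only the base value.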
4.1 RESEARCH ISSUES ON WEB STRUCTURE MINING
Web structure mining has two main issues, both caused by the huge amount of data involved.
• Reducing irrelevant search results. The relevance of search results suffers because search
engines often achieve only low precision.
• Indexing information on the web [7]. Difficulties in indexing lead to low recall in content
mining.
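Precision and recall, the two measures behind these issues, can be computed as in the following Python sketch. Precision is the fraction of retrieved pages that are relevant, and recall is the fraction of relevant pages that are retrieved. The page identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant                      # relevant pages that were found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = ["p1", "p2", "p3", "p4"]                 # what the search engine returned
relevant = ["p2", "p4", "p7", "p8", "p9"]            # what the user actually needed
precision, recall = precision_recall(retrieved, relevant)
```

In this example half of the returned pages are irrelevant (precision 0.5) and three of the five relevant pages are never returned (recall 0.4).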
5. WEB USAGE MINING
Web usage mining, also called web log mining, is used to analyze the behaviour of online
users [2]. It is divided into two types of tracking: general access tracking and customized
usage tracking [3]. General access tracking is used to predict customer behaviour on the web,
and it identifies the user while the user interacts with the web. Usage data is stored
automatically in the web server log and the application log [15]. Web logs can be collected at
three locations: the web server, the web proxy server and the client browser. A web log is a
plain text file (.txt). A large amount of irrelevant data is present in a web log file because
it contains noisy, incomplete, erroneous and unnecessary information [6]. Webmasters and system
administrators use web server log files to identify errors and failed requests. Web usage
mining extracts the data stored in server access logs, referrer logs, agent logs and error
logs.
Web usage mining generally uses basic data mining algorithms such as association rule mining,
sequential rule mining, clustering and classification. Several tools are available to analyze
the behaviour of the user: KOINOTITES, WebSIFT, Web Usage Miner, INSITE, SpeedTracer,
Archcollect, i-Miner, AWUSA, i-JADE Web Miner, WebQuilt, STRATDYN, SEWeP, WebTool, MiDAS,
WebMate, WebLog Miner, DB2 Intelligent Miner for Data, PolyAnalyst version 6.0, Clementine,
WEBMINER and WEBVIZ. Figure 4 shows the different web server logs [1].
Figure 4. Web Server Logs
5.1 WEB SERVER LOGS
5.1.1 Access log
The access log captures information about the user and has a large number of attributes. It
records each click event, hit and access of the user. It is one of the web server logs [16].
5.1.2 Agent log
The agent log records details about the online user's behaviour, the user's browser, the
browser version and the operating system. It is a standard log file that is kept alongside the
access log.
5.1.3 Error log
The error log records failed requests. For example, when a user clicks on a particular link and
the server cannot find the requested page or website, the user receives a 404 Not Found error.
5.1.4 Referrer log
The referrer log stores the URLs of web pages on other sites that link to the server's web
pages. That is, if a user gets to one of the server's pages by clicking on a link from another
site, the URL of that site will appear in this log [21].
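The four kinds of information described above often appear together in a single log line. The following Python sketch assumes log lines in the widely used "combined" log format of common web servers; the paper itself does not prescribe a format, and the sample lines, hosts and user agents are invented for illustration.

```python
import re
from collections import Counter

# Regular expression for one line in the (assumed) combined log format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

lines = [
    '192.0.2.1 - - [10/Jul/2015:10:00:00 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "http://search.example/?q=web+mining" "Mozilla/5.0 (Windows NT 6.1)"',
    '192.0.2.1 - - [10/Jul/2015:10:00:05 +0000] "GET /missing.html HTTP/1.1" '
    '404 312 "http://site.example/index.html" "Mozilla/5.0 (Windows NT 6.1)"',
    '192.0.2.7 - - [10/Jul/2015:10:01:00 +0000] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "ExampleBot/1.0"',
]

# Access information: one dictionary per successfully parsed request.
entries = [m.groupdict() for m in map(LOG_PATTERN.match, lines) if m]

# Error information: requests answered with a 4xx or 5xx status code.
errors = [e["url"] for e in entries if e["status"].startswith(("4", "5"))]

# Referrer information: "-" means that no referrer was sent.
referrers = Counter(e["referrer"] for e in entries if e["referrer"] != "-")

# Agent information: which browsers or programs issued the requests.
agents = Counter(e["agent"] for e in entries)
```

Servers that write separate access, agent, referrer and error files provide the same fields, only distributed over several files.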
5.2 PROCESS OF WEB USAGE MINING
Web usage mining process is generally divided into three tasks:
5.2.1 Data pre-processing
Web log data pre-processing identifies users, sessions, page views and so on. To improve
efficiency and scalability, several steps are required: data fusion; data cleaning; user
identification using IP addresses, authentication data, cookies, client information and site
topology; session identification; formatting; and path completion [6].
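Three of these steps (data cleaning, user identification and session identification) can be sketched in Python as follows. The record layout, the list of static file suffixes, the use of the (IP address, agent) pair as the user key and the 30-minute timeout are assumptions made for this example; other heuristics are possible.

```python
from collections import defaultdict

STATIC_SUFFIXES = (".png", ".jpg", ".gif", ".css", ".js")   # assumed noise
SESSION_TIMEOUT = 30 * 60                                     # assumed: 30 minutes

def preprocess(records):
    """records: iterable of (ip, agent, timestamp_seconds, url, status)."""
    # Data cleaning: drop failed requests and embedded static resources.
    cleaned = [r for r in records
               if r[4] == 200 and not r[3].lower().endswith(STATIC_SUFFIXES)]

    # User identification: a user is approximated by the pair (IP, agent).
    by_user = defaultdict(list)
    for ip, agent, ts, url, _ in sorted(cleaned, key=lambda r: r[2]):
        by_user[(ip, agent)].append((ts, url))

    # Session identification: a long gap between requests starts a new session.
    sessions = []
    for user, visits in by_user.items():
        current = [visits[0][1]]
        for (prev_ts, _), (ts, url) in zip(visits, visits[1:]):
            if ts - prev_ts > SESSION_TIMEOUT:
                sessions.append((user, current))
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions

log = [
    ("192.0.2.1", "Firefox", 0, "/index.html", 200),
    ("192.0.2.1", "Firefox", 2, "/style.css", 200),
    ("192.0.2.1", "Firefox", 60, "/products.html", 200),
    ("192.0.2.1", "Firefox", 4000, "/index.html", 200),
    ("192.0.2.1", "Chrome", 30, "/index.html", 200),
    ("192.0.2.1", "Chrome", 45, "/old.html", 404),
]
sessions = preprocess(log)
```

Although all requests come from one IP address, the two different agents are treated as two users, and the long pause of the first user splits the requests into two sessions.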
5.2.2 Pattern discovery
In pattern discovery, data mining techniques and algorithms such as clustering, association
rules and sequential analysis are applied. The association technique is mostly used in pattern
discovery to detect relations between the pages visited by online users. It is used to extract
patterns of usage from web data [4]. The extracted patterns can be represented in many ways,
such as graphs, charts, tables and forms.
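The association technique can be illustrated with a small Python sketch that derives rules of the form "visitors of page X also visit page Y" from user sessions. Support is the fraction of sessions that contain both pages, and confidence is the fraction of sessions containing X that also contain Y. The thresholds and sessions below are hypothetical, and the sketch considers only pairs of pages rather than implementing a full algorithm such as Apriori.

```python
from itertools import combinations
from collections import Counter

def page_rules(sessions, min_support=0.5, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) between pairs of pages."""
    n = len(sessions)
    baskets = [set(s) for s in sessions]              # ignore repeated visits
    item_count = Counter(p for b in baskets for p in b)
    pair_count = Counter(pair for b in baskets
                         for pair in combinations(sorted(b), 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

sessions = [
    ["/index", "/products", "/cart"],
    ["/index", "/products"],
    ["/index", "/about"],
    ["/products", "/cart"],
]
rules = page_rules(sessions)
```

With these thresholds the only rule found is that every session containing the cart page also contains the products page; the reverse rule is rejected because only two of the three product sessions reach the cart.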
Figure 5. Process of Web Usage Mining
5.2.3 Pattern analysis
The last step of web usage mining is pattern analysis. Many techniques are used for pattern
analysis, such as visualization, OLAP, data and knowledge querying, and usability
analysis [4].
5.3 RESEARCH ISSUES ON WEB USAGE MINING
Web usage mining has several issues because it involves a number of data mining techniques. The
problems are [11]:
• Session identification
• CGI data
• Caching
• Dynamic pages
• Robot detection and filtering
• Transaction identification
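As an example of one of these problems, robot detection and filtering can be sketched in Python with two simple heuristics: a client is treated as a robot if its user-agent string contains a typical crawler keyword or if it requests the file /robots.txt. The keyword list and the sample requests are assumptions of this example; robots that disguise themselves as browsers require behaviour based detection, which is not shown here.

```python
# Assumed substrings that commonly appear in crawler user-agent strings.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

def find_robots(requests):
    """requests: iterable of (ip, agent, url). Returns robot (ip, agent) pairs."""
    robots = set()
    for ip, agent, url in requests:
        declared = any(marker in agent.lower() for marker in BOT_MARKERS)
        fetched_robots_txt = url == "/robots.txt"
        if declared or fetched_robots_txt:
            robots.add((ip, agent))
    return robots

def filter_robots(requests):
    """Remove every request issued by a client identified as a robot."""
    requests = list(requests)
    robots = find_robots(requests)
    return [r for r in requests if (r[0], r[1]) not in robots]

requests = [
    ("192.0.2.1", "Mozilla/5.0 (Windows NT 6.1)", "/index.html"),
    ("192.0.2.9", "Googlebot/2.1", "/index.html"),
    ("192.0.2.5", "CustomFetcher/0.1", "/robots.txt"),
    ("192.0.2.5", "CustomFetcher/0.1", "/products.html"),
]
human_requests = filter_robots(requests)
```

All requests of a client are removed once any one of them marks it as a robot, so the second request of the undeclared fetcher is filtered as well.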
6. CONCLUSION AND FUTURE WORK
This paper has discussed the research issues and challenges in web mining and has provided a
detailed review of the basic concepts of web mining, web content mining, structure mining and
usage mining, together with their tools, algorithms and types. Several open research issues and
drawbacks that exist in the current techniques have also been discussed. This study and review
should be helpful for researchers working in the domain of web mining.
REFERENCES
[1] Joy Shalom Sona, Prof. Asha Ambhaikar, "A Reconciling Website System to Enhance Efficiency with
Web Mining Techniques", International Journal of Scientific & Engineering Research, Volume 3,
Issue 2, February 2012, ISSN 2229-5518
[2] Aparna Ranade, Abhijit R. Joshi, Ph.D., "Techniques for Understanding User Usage Behavior on the
Internet", International Journal of Computer Applications (0975 – 8887), Volume 92, No. 7, April
2014
[3] Karan Bhalla, Deepak Prasad, "Data Preparation and Pattern Discovery for Web Usage Mining"
[4] Amit Pratap Singh, Dr. R. C. Jain, "A Survey on Different Phases of Web Usage Mining for
Anomaly User Behavior Investigation", International Journal of Emerging Trends & Technology in
Computer Science (IJETTCS), Volume 3, Issue 3, May – June 2014, ISSN 2278-6856
[5] R. Lokeshkumar, R. Sindhuja, Dr. P. Sengottuvelan, "A Survey on Pre-processing of Web Log File
in Web Usage Mining to Improve the Quality of Data", International Journal of Emerging Technology
and Advanced Engineering, ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 8,
August 2014
[6] Mitali Srivastava, Rakhi Garg, P. K. Mishra, "Preprocessing Techniques in Web Usage Mining: A
Survey", International Journal of Computer Applications (0975 – 8887), Volume 97, No. 18, July 2014
[7] http://www.slideshare.net/akhanna3/discovering-knowledge-using-web-structure-mining-27488978
[8] Ashish Kumar Garg, Mohammad Amir, Jarrar Ahmed, Man Singh, Sham Bansa, "Implementation of
a Search Engine", International Journal of Science and Research (IJSR), ISSN (Online): 2319-7064
[9] C. Gomathi, M. Moorthi, "Web Access Pattern Algorithms in Education Domain", Computer and
Information Science Journal, Vol. 1, No. 4, November 2008
[10] Md. Zahid Hasan, Khawja Jakaria Ahmad Chisty and Nur-E-Zaman Ayshik, "Research Challenges
in Web Data Mining", International Journal of Computer Science and Telecommunications, Volume
3, Issue 7, July 2012
[11] Jaideep Srivastava, "Web Mining: Accomplishments & Future Directions", University of Minnesota,
USA, srivasta@cs.umn.edu, http://www.cs.umn.edu/faculty/srivasta.html
[12] http://www.slideshare.net/AmirFahmideh/web-mining-structure-mining
[13] http://www.slideshare.net/Tommy96/web-mining-tutorial
[14] http://www.faadooengineers.com/threads/2177-PPT-Link-Mining
[15] Deepti Kapila, Prof. Charanjit Singh, "Survey on Page Ranking Algorithms for Digital Libraries",
International Journal of Advanced Research in Computer Science and Software Engineering, Volume
4, Issue 6, June 2014, ISSN: 2277 128X
[16] Alberto Sillitti, Marco Scotto, Giancarlo Succi, Tullio Vernazza, "News Miner: a Tool for
Information Retrieval"
[17] Sandhya, Mala Chaturvedi, "A Survey on Web Mining Algorithms", The International Journal of
Engineering and Science (IIJES), Volume 2, Issue 3
[18] Ananthi. J, "A Survey Web Content Mining Methods and Applications for Information Extraction
from Online Shopping Sites", International Journal of Computer Science and Information
Technologies, Vol. 5 (3), 2014
[19] S. Balan, "A Study of Various Techniques of Web Content Mining Research Issues and Tools",
International Journal of Innovative Research and Studies, ISSN 2319-9725
[20] D. Jayalatchumy, Dr. P. Thambidurai, "Web Mining Research Issues and Future Directions – A
Survey", IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727,
Volume 14, Issue 3
[21] Naga Lakshmi, Raja Sekhara Rao, Sai Satyanarayana Reddy, "An Overview of Preprocessing on
Web Log Data for Web Usage Analysis", International Journal of Innovative Technology and
Exploring Engineering (IJITEE), ISSN: 2278-3075, Volume 2, Issue 4
[22] Mamta M. Hegde, Prof. M. V. Phatak, "Developing an approach for hyperlink analysis with noise
reduction using Web Structure Mining", International Journal of Advanced Research in Computer
Engineering & Technology, Volume 1, Issue 3, May 2012
[23] Mr. Dushyant B. Rathod, Dr. Samrat Khanna, "A Review on Emerging Trends of Web Mining and its
Applications", ISSN: 2321-9939
BIOGRAPHY
Dr. S. Vijayarani has completed MCA, M.Phil and Ph.D in Computer Science. She is
working as Assistant Professor in the School of Computer Science and Engineering,
Bharathiar University, Coimbatore. Her fields of research interest are data mining,
privacy and security issues in data mining and data streams. She has published papers
in international journals and presented research papers in international and
national conferences.
Ms. E. Suganya has completed M.Sc in Computer Science. She is currently pursuing
her M.Phil in Computer Science in the School of Computer Science and Engineering,
Bharathiar University, Coimbatore. Her fields of interest are Data Mining and Web
Mining.

ANALYTICAL IMPLEMENTATION OF WEB STRUCTURE MINING USING DATA ANALYSIS IN ONLI...
 
Aa03401490154
Aa03401490154Aa03401490154
Aa03401490154
 
A Study of Pattern Analysis Techniques of Web Usage
A Study of Pattern Analysis Techniques of Web UsageA Study of Pattern Analysis Techniques of Web Usage
A Study of Pattern Analysis Techniques of Web Usage
 
Webmining ppt
Webmining pptWebmining ppt
Webmining ppt
 
Web mining
Web miningWeb mining
Web mining
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Web Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage miningWeb Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage mining
 
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
 
Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...Identifying the Number of Visitors to improve Website Usability from Educatio...
Identifying the Number of Visitors to improve Website Usability from Educatio...
 
Web-Application Framework for E-Business Solution
Web-Application Framework for E-Business SolutionWeb-Application Framework for E-Business Solution
Web-Application Framework for E-Business Solution
 
Bb31269380
Bb31269380Bb31269380
Bb31269380
 
H0314450
H0314450H0314450
H0314450
 
A Study on Web Structure Mining
A Study on Web Structure MiningA Study on Web Structure Mining
A Study on Web Structure Mining
 

More from ijcax

NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGES
NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGESNEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGES
NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGESijcax
 
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEM
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEMTHE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEM
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEMijcax
 
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKS
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKSCOMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKS
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKSijcax
 
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...ijcax
 
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation A Multi Criteria Decision Making Based Approach for Semantic Image Annotation
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation ijcax
 
On Fuzzy Soft Multi Set and Its Application in Information Systems
On Fuzzy Soft Multi Set and Its Application in Information Systems On Fuzzy Soft Multi Set and Its Application in Information Systems
On Fuzzy Soft Multi Set and Its Application in Information Systems ijcax
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING ijcax
 
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR ijcax
 
BLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLINDBLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLINDijcax
 
INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...
INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...
INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...ijcax
 
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORM
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORMADVANCED E-VOTING APPLICATION USING ANDROID PLATFORM
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORMijcax
 
TRACE LENGTH CALCULATION ON PCBS
TRACE LENGTH CALCULATION ON PCBSTRACE LENGTH CALCULATION ON PCBS
TRACE LENGTH CALCULATION ON PCBSijcax
 
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...ijcax
 
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...ijcax
 
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...ijcax
 
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR ijcax
 
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...Developing Product Configurator Tool Using CADs’ API with the help of Paramet...
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...ijcax
 
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...ijcax
 
BLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLINDBLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLINDijcax
 
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION ijcax
 

More from ijcax (20)

NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGES
NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGESNEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGES
NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGES
 
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEM
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEMTHE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEM
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEM
 
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKS
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKSCOMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKS
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKS
 
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ...
 
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation A Multi Criteria Decision Making Based Approach for Semantic Image Annotation
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation
 
On Fuzzy Soft Multi Set and Its Application in Information Systems
On Fuzzy Soft Multi Set and Its Application in Information Systems On Fuzzy Soft Multi Set and Its Application in Information Systems
On Fuzzy Soft Multi Set and Its Application in Information Systems
 
RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING RESEARCH ISSUES IN WEB MINING
RESEARCH ISSUES IN WEB MINING
 
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
 
BLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLINDBLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLIND
 
INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...
INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...
INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS W...
 
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORM
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORMADVANCED E-VOTING APPLICATION USING ANDROID PLATFORM
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORM
 
TRACE LENGTH CALCULATION ON PCBS
TRACE LENGTH CALCULATION ON PCBSTRACE LENGTH CALCULATION ON PCBS
TRACE LENGTH CALCULATION ON PCBS
 
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
 
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...
 
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...
 
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR
 
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...Developing Product Configurator Tool Using CADs’ API with the help of Paramet...
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...
 
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...
 
BLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLINDBLIND AID : TRAVEL AID FOR BLIND
BLIND AID : TRAVEL AID FOR BLIND
 
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION
 

Recently uploaded

Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........LeaCamillePacle
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 

Recently uploaded (20)

Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........Atmosphere science 7 quarter 4 .........
Atmosphere science 7 quarter 4 .........
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 

Research Issues in Web Mining

  • 1. International Journal of Computer-Aided Technologies (IJCAx) Vol.2, No.3, July 2015
DOI:10.5121/ijcax.2015.2305

RESEARCH ISSUES IN WEB MINING

Dr. S. Vijiyarani1 and Ms. E. Suganya2
1 Assistant Professor, Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore
2 M.Phil Research Scholar, Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore

ABSTRACT
The web is a collection of inter-related files on one or more web servers, while web mining means extracting valuable information from web databases. Web mining is one of the data mining domains in which data mining techniques are used to extract information from web servers. Web data includes web pages, web links, objects on the web and web logs. Web mining is used to understand customer behaviour and to evaluate a particular website based on the information stored in web log files. Web mining is performed using data mining techniques, namely classification, clustering, and association rules. Its application areas include electronic commerce, e-learning, e-government, e-policies, e-democracy, electronic business, security, crime investigation and digital libraries. Retrieving the required web page from the web efficiently and effectively is a challenging task because the web is made up of unstructured data, which delivers a large amount of information and increases the complexity of dealing with information from different web service providers. It becomes very hard to find, extract, filter or evaluate the information relevant to the users. In this paper, we study the basic concepts of web mining, its classification, processes and issues. In addition, the paper analyzes the research challenges in web mining.

KEYWORDS
Web Mining, Classification, Application, Tools, Algorithms, Research Issues

1. INTRODUCTION
Web mining is the application of data mining techniques to unstructured or semi-structured web data; it automatically discovers and extracts potentially useful and previously unknown information or knowledge from the web [23]. The significant web mining applications are website design, web search, search engines, information retrieval, network management, e-commerce, business and artificial intelligence, web market places and web communities. Online business breaks the barriers of time and space compared with business conducted from a physical office. Big companies around the world are realizing that e-commerce is not just buying and selling over the Internet; it also improves their efficiency in competing with other giants in the market. These applications also raise temporal issues for the users.
  • 2. Web mining is classified into three categories, namely web content mining, web structure mining and web usage mining. Each category has its own algorithms and tools.

Web content mining is the discovery of valuable information from web documents, which may contain text, images, hyperlinks, metadata and structured records. Search engines and web spiders, e.g. Google and Yahoo, use it to examine information. It is the process of retrieving useful information from web content or web documents.

Web structure mining is the process of discovering structure information from websites. The structure is a graph of web pages and hyperlinks, where the web pages are the nodes and the hyperlinks are the edges connecting related pages.

Web usage mining is also called web log mining. It reflects the user's behaviour by capturing meaningful patterns from one or more web localities [9].

The web mining process consists of four important steps: resource finding, data selection and pre-processing, generalization, and analysis [23]. Resource finding extracts the data from either online or offline text resources. In the data selection and pre-processing step, specific information from the retrieved web sources is automatically selected and pre-processed. During generalization, data mining and machine learning techniques are used to discover general patterns within individual web sites as well as across multiple sites. Validation and interpretation of the mined patterns are done in the analysis step [1][17].

The three categories of web mining are illustrated in Figure 1.

Figure 1. Classification of Web Mining

The remainder of the paper is organized as follows. Section 2 discusses the research issues in web mining. Web content mining and its research challenges are given in Section 3. Section 4 describes web structure mining. Section 5 provides the details about web usage mining. Conclusions are given in Section 6.
  • 3. 2. RESEARCH ISSUES IN WEB MINING
The web is highly dynamic: many pages are added, updated and removed every day, and it handles a huge volume of information, which gives rise to a number of problems. Web data is typically high dimensional, and access to it is constrained by limited query interfaces, keyword-oriented search and limited customization for individual users. As a result, it is very difficult to find relevant information on the web, which in turn creates new issues. The web mining techniques of classification, clustering and association rules are used to understand customer behaviour and to evaluate a particular website using traditional data mining parameters. The web mining process is divided into four steps: resource finding, data selection and pre-processing, generalization, and analysis [11] [8].

Web measurement, or web analytics, is one of the significant challenges in web mining. The measurement factors, namely hits, page views, visits (user sessions) and unique visitors, are regularly used to measure the user impact of proposed changes. Large institutions and organizations archive usage data from their web sites [10]. A main problem is detecting and/or preventing fraudulent activity. Web usage mining algorithms have become more efficient and accurate, but challenges remain. Data cleaning is the most important process, yet it becomes difficult when the data is heterogeneous [20]. Maintaining accuracy in classifying the data also needs attention. Although many classification techniques exist, the quality of clustering is still an open question.
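The measurement factors named above can be made concrete with a small sketch. It assumes the log has already been parsed into (visitor, timestamp, path) records; the sample records, the 30-minute inactivity cut-off for a visit and the rule that only `.html` or directory paths count as page views are illustrative assumptions, not definitions taken from the paper.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed log records: (visitor id, timestamp, requested path).
LOG = [
    ("10.0.0.1", "2015-07-01 09:00:00", "/index.html"),
    ("10.0.0.1", "2015-07-01 09:00:01", "/logo.png"),
    ("10.0.0.1", "2015-07-01 09:05:00", "/products.html"),
    ("10.0.0.2", "2015-07-01 09:10:00", "/index.html"),
    ("10.0.0.1", "2015-07-01 11:00:00", "/index.html"),
]

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cut-off between visits
PAGE_SUFFIXES = (".html", "/")           # assumed definition of a "page"


def web_metrics(records):
    """Return hits, page views, visits and unique visitors for the records."""
    hits = len(records)  # every request counts as a hit
    page_views = sum(1 for _, _, path in records if path.endswith(PAGE_SUFFIXES))

    # Group request times by visitor so that sessions can be split per visitor.
    times_by_visitor = defaultdict(list)
    for visitor, stamp, _ in records:
        times_by_visitor[visitor].append(
            datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        )

    visits = 0
    for times in times_by_visitor.values():
        times.sort()
        visits += 1  # the first request of a visitor always opens a visit
        for previous, current in zip(times, times[1:]):
            if current - previous > SESSION_TIMEOUT:
                visits += 1  # a long gap starts a new visit

    return {
        "hits": hits,
        "page_views": page_views,
        "visits": visits,
        "unique_visitors": len(times_by_visitor),
    }


if __name__ == "__main__":
    print(web_metrics(LOG))
```

Identifying visitors by IP address, as done here, is only a rough approximation; real logs usually need proxies, shared addresses and crawlers to be handled during data cleaning.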
2.1 Major issues in Web Mining
• Web data sets can be very large; they can take tens to hundreds of terabytes of database storage
• Such data cannot be mined on a single server, so a large number of servers is needed
• Proper organization of hardware and software to mine multi-terabyte data sets
• Limited customization, limited coverage, and limited query interface for individual users
• Automated data cleaning
• Overfitting and underfitting of data
• Oversampling of data
• Scaling up for high dimensional data
• Mining sequence and time series data
• Difficulty in finding relevant information
• Extracting new knowledge from the web

3. WEB CONTENT MINING
Web content mining data may be structured or unstructured/semi-structured, although much of the web is unstructured. It is the process of retrieving information from the web into more structured forms and indexing that information so that it can be retrieved quickly, or of finding valuable information in web content or web documents. Web content mining covers web documents, which may consist of text, HTML and multimedia such as images, audio and video. Search result mining works on web search results, which may be structured or unstructured documents.
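The idea of indexing web content so that it can be retrieved quickly can be illustrated with a minimal inverted index. This is a sketch only: the URLs and page texts are hypothetical, tokenization is a plain lowercase word split, and no particular search engine's method is implied.

```python
import re
from collections import defaultdict

# Hypothetical page texts keyed by URL.
PAGES = {
    "http://example.org/a": "Web mining extracts knowledge from web data",
    "http://example.org/b": "Data mining techniques include clustering",
    "http://example.org/c": "Clustering groups similar web pages",
}


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, "web mining"))
```

Looking up a term is a dictionary access rather than a scan over all pages, which is the reason content is indexed before it is queried.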
  • 4. International Journal of Computer-Aided Technologies (IJCAx) Vol.2, No.3, July 2015 58 Web content mining used many algorithms and tools such as Genetic algorithm, Cluster Hierarchy Construction Algorithm (CHCA), Correlation algorithm. Web Info Extractor (WIE), Mozenda, screen-scrapper, ontology based tools; web content extractor and automation anywhere are content mining tools. Cloud users require to extract the information from the cloud provided by web servers can make use of the web mining. For instance, Web communities can be maintained the information such as facebook. That is the users of same field of interest can be grouped and they can communicate through the network. Digital library performs automated citation indexing using web mining techniques. E-services include e-banking, search engines, on- line auctions, on-line knowledge management, social networking, e-learning, blog analysis, and personalization and recommendation systems. This can be analyzed for the customers and enable provision to the customers based on their recommendations [18]. It has two approaches; they are (i) Agent based and (ii) Database Approach. Figure 2 gives the web content mining approaches. Figure2. Web Content Mining approaches (i) Agent Based Approach Agent based approach focuses on searching relevant information from the World Wide Web. Three types of agents they are (i) Intelligent search agents – Automatically searches for information along with a particular query (ii) Information filtering/categorizing agents - Filters the data (ii) Personalized web agents – Discovers the documents those are related to the user profiles (iii)Database Approach Database approach consists of databases which contain attributes, tables and schema with defined domains. 
It focuses on techniques for organizing the semi-structured data on the web into more structured collections of resources, and on using standard database querying mechanisms and data mining techniques to analyze it, for example multilevel databases and web query systems [5].
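The database approach can be illustrated with a minimal sketch that turns semi-structured HTML into rows and then runs a database-style selection over them. The markup, the class names (`product`, `name`, `price`) and the helper names are assumptions made for this illustration; they do not come from any tool listed above.

```python
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Collects {name, price} records from a hypothetical product listing."""

    def __init__(self):
        super().__init__()
        self.records = []    # finished structured rows
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        if tag == "li" and css_class == "product":
            self.records.append({})            # start a new record
        elif css_class in ("name", "price") and self.records:
            self._field = css_class            # next text belongs to this field

    def handle_data(self, data):
        if self._field and data.strip():
            self.records[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        self._field = None


def extract_products(html):
    """Return the structured rows found in an HTML fragment."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.records


def cheaper_than(records, limit):
    """A database-style selection over the extracted rows."""
    return [r["name"] for r in records if float(r["price"]) < limit]


# Hypothetical page fragment used only for this example.
SAMPLE = """
<ul>
  <li class="product"><span class="name">Pen</span> <span class="price">1.50</span></li>
  <li class="product"><span class="name">Notebook</span> <span class="price">4.00</span></li>
</ul>
"""

if __name__ == "__main__":
    rows = extract_products(SAMPLE)
    print(rows)
    print(cheaper_than(rows, 2.0))
```

Once the page is reduced to rows, any standard query mechanism can be applied; a real system would load the rows into a database rather than filter a Python list.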
Web content mining also has other approaches to mining data: unstructured text mining, structured data mining, semi-structured text mining and multimedia data mining [16].

3.1 RESEARCH ISSUES ON WEB CONTENT MINING
Web content mining has a number of research issues because it extracts information from web search engines.
• Data/information extraction concentrates on extracting structured data from web pages, such as products and search results.
• Web information integration and schema matching. The web contains a large amount of data, and each website represents similar information in a different way. Discovering similar data is an important problem with many practical applications.
• Opinion extraction from online sources, i.e. customer reviews of products, forums, blogs and chat rooms. Mining opinions is of great importance for marketing intelligence and product benchmarking.
• Automatically segmenting web pages and detecting noise is an interesting problem in web applications. The main content of a page should not include advertisements, navigation links and copyright notices. Hence, extracting the main content of a web page is an important problem [19].

4. WEB STRUCTURE MINING
Web structure mining is the study of data related to the structure of a particular website. It works on the web graph, which contains web pages or web documents as nodes and the hyperlinks connecting two related pages as edges [7]. Figure 3 represents the web graph structure.

Figure 3. Web Graph Structure

Web structure is a useful source for extracting information. Web structure mining extracts interesting web graph patterns such as co-citation, social choice and complete bipartite graphs [1]. It classifies web pages by topic and helps decide which web pages should be added to a collection of web pages.
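As a small illustration of one such pattern, the sketch below represents a web graph as an adjacency mapping and counts co-citations, that is, how many pages link to both members of a pair. The page names and the function name are hypothetical, and this is only one simple way to compute the measure.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_counts(links):
    """Count, for each pair of pages, how many pages link to both.

    `links` maps a source page to the set of pages it links to.
    """
    counts = defaultdict(int)
    for source, targets in links.items():
        # Every unordered pair of targets is co-cited once by `source`.
        for a, b in combinations(sorted(targets), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical web graph: nodes are pages, edges are hyperlinks.
WEB_GRAPH = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"D"},
    "E": {"A", "C", "D"},
}

if __name__ == "__main__":
    for pair, count in sorted(cocitation_counts(WEB_GRAPH).items()):
        print(pair, count)
```

Pairs with a high count, here C and D, are candidates for being topically related, which is the intuition behind using co-citation to group pages.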
Web structure mining can be performed at either the intra-page level or the inter-page level. A hyperlink that connects to a different part of the same page is called an intra-page hyperlink; this is the document structure level [22].
A hyperlink that connects two different pages is called an inter-page hyperlink; this is the hyperlink level [12]. A web page is organized in a tree structure based on its HTML tags. Here, documents are extracted automatically using the Document Object Model (DOM). The main reason for developing link mining is to understand the social organization of the web. The study of structure analysis is called link mining [14]; it lies at the intersection of work in link analysis, hypertext and web mining, relational learning, inductive logic programming and graph mining. Some of the important tasks of link mining are link-based classification, link-based cluster analysis, and the prediction of link type, link strength and link cardinality. The study of the hyperlink level is also called hyperlink analysis [22], which can be used to retrieve useful information from the web [13].

Web structure mining is used in search engines such as Google and Yahoo. The HITS algorithm was used in the Clever search engine by IBM, and the PageRank algorithm is used by Google [11]. Algorithms for web structure mining include the HITS (Hyperlink-Induced Topic Search) algorithm, the max-flow min-cut algorithm, the ECLAT algorithm and the PageRank algorithm. Two variants of PageRank are the weighted PageRank algorithm and the topic-sensitive PageRank algorithm.

4.1 RESEARCH ISSUES ON WEB STRUCTURE MINING
Web structure mining has two issues due to the huge amount of data involved.
• Reducing irrelevant search results. Search results become poorly organized by relevance because search engines often support only low-precision criteria.
• Indexing information on the web [7]. This causes low recall with content mining.

5. WEB USAGE MINING
Web usage mining, also called web log mining, is used to analyze the behaviour of online users [2].
It falls into two types of tracking: general access tracking and customized usage tracking [3]. General access tracking is used to predict customer behaviour on the web, and it identifies the user while the user interacts with the web. The data is stored automatically in the web server log and the application log [15]. Web logs are kept in three different locations: the web server, the web proxy server and the client browser. A web log is a plain text file (.txt). Web log files contain a large amount of irrelevant data, because they hold noisy, incomplete, erroneous and unnecessary information [6]. Webmasters and system administrators use web server log files to identify errors and failed requests.

Web usage mining extracts the data stored in server access logs, referrer logs, agent logs and error logs. It generally uses basic data mining algorithms such as association rule mining, sequential rule mining, clustering and classification. Several tools are available to analyze user behaviour: KOINOTITES, WebSIFT, Web Usage Miner, INSITE, SpeedTracer, Archcollect, i-Miner, AWUSA, i-JADE Web Miner, WebQuilt, STRATDYN, SEWeP, WebTool, MiDAS, WebMate, WebLogMiner, DB2 Intelligent Miner for Data, PolyAnalyst version 6.0, Clementine, WEBMINER and WEBVIZ. Figure 4 shows the different web server logs [1].
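To make the log data more concrete, the following sketch parses a single log entry into named fields. The paper does not specify a log format, so this assumes the widely used NCSA Combined Log Format, in which access, referrer and agent information appear on one line; the sample line and the function name are made up for the illustration.

```python
import re
from datetime import datetime

# Assumed layout (NCSA Combined Log Format):
# host ident user [time] "request" status size "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it is malformed."""
    match = LOG_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    try:
        entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # The request field normally reads "METHOD path PROTOCOL".
    parts = entry["request"].split()
    entry["path"] = parts[1] if len(parts) == 3 else None
    return entry


# Hypothetical entry for a failed request that arrived via a link on another site.
SAMPLE_LINE = (
    '192.0.2.10 - - [10/Jul/2015:13:55:36 +0530] '
    '"GET /missing.html HTTP/1.1" 404 512 '
    '"http://example.org/links.html" "Mozilla/5.0"'
)

if __name__ == "__main__":
    print(parse_log_line(SAMPLE_LINE))
```

Lines that do not match are returned as `None`, which gives the data cleaning step a simple way to discard incomplete or corrupted entries.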
Figure 4. Web Server Logs

5.1 WEB SERVER LOGS
5.1.1 Access log
The access log captures information about the user and has many attributes. It records each click event, hit and access by the user. It is one of the web server logs [16].

5.1.2 Agent log
The agent log records details about online user behaviour, the user's browser, the browser's version and the operating system. It is a standard log file in comparison with the access log.

5.1.3 Error log
When a user clicks on a link and the browser cannot display the requested page or website, the user receives a "404 Not Found" error. The error log records such failed requests.

5.1.4 Referrer log
The referrer log stores the URLs of web pages on other sites that link to the server's pages. That is, if a user reaches one of the server's pages by clicking a link on another site, the URL of that site appears in this log [21].

5.2 PROCESS OF WEB USAGE MINING
The web usage mining process is generally divided into three tasks:

5.2.1 Data pre-processing
Web log data pre-processing identifies users, sessions, page views and so on. To improve efficiency and scalability, many steps are required: data fusion,
data cleaning, user identification (by IP address, authentication data, cookies, client information and site topology), session identification, formatting and path completion [6].

5.2.2 Pattern discovery
Data mining techniques and algorithms such as clustering, association rules and sequential analysis are used for pattern discovery. The association technique is mostly used to detect relations between the pages visited by online users. It is used to extract patterns of usage from web data [4]. The extracted patterns can be represented in many ways, such as graphs, charts, tables and forms.

Figure 5. Process of web usage mining

5.2.3 Pattern analysis
The last step of web usage mining is pattern analysis. Many techniques are used for pattern analysis, such as visualization, OLAP, data and knowledge querying, and usability analysis [4].

5.3 RESEARCH ISSUES ON WEB USAGE MINING
Web usage mining has several issues because it involves a number of data mining techniques. The problems are [11]:
• Session identification
• CGI data
• Caching
• Dynamic pages
• Robot detection and filtering
• Transaction identification
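Two of the issues listed above, session identification and robot filtering, can be sketched with simple heuristics. The code below assumes that a user is identified by the pair (host, user agent), that a gap of more than 30 minutes starts a new session, and that robots reveal themselves by fetching /robots.txt or by a keyword in the user-agent string. All three are common simplifications chosen for this example, not methods prescribed by the paper, and the names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)      # assumed inactivity threshold
ROBOT_HINTS = ("bot", "crawler", "spider")   # assumed user-agent keywords


def is_robot(agent, paths):
    """Heuristic robot check based on the user agent and requested paths."""
    agent = agent.lower()
    return "/robots.txt" in paths or any(hint in agent for hint in ROBOT_HINTS)


def sessionize(requests):
    """Group (host, agent, time, path) tuples into (host, [paths]) sessions."""
    by_user = defaultdict(list)
    for host, agent, time, path in requests:
        by_user[(host, agent)].append((time, path))

    sessions = []
    for (host, agent), hits in by_user.items():
        hits.sort()                                   # chronological order
        if is_robot(agent, {path for _, path in hits}):
            continue                                  # drop robot traffic
        current = [hits[0][1]]
        for (prev_time, _), (time, path) in zip(hits, hits[1:]):
            if time - prev_time > SESSION_TIMEOUT:
                sessions.append((host, current))      # close the old session
                current = []
            current.append(path)
        sessions.append((host, current))
    return sessions


# Hypothetical pre-processed log records.
T0 = datetime(2015, 7, 10, 9, 0)
SAMPLE_REQUESTS = [
    ("192.0.2.1", "Mozilla/5.0", T0, "/index.html"),
    ("192.0.2.1", "Mozilla/5.0", T0 + timedelta(minutes=5), "/products.html"),
    ("192.0.2.1", "Mozilla/5.0", T0 + timedelta(hours=2), "/index.html"),
    ("192.0.2.9", "ExampleBot/1.0", T0 + timedelta(minutes=1), "/robots.txt"),
    ("192.0.2.9", "ExampleBot/1.0", T0 + timedelta(minutes=2), "/index.html"),
]

if __name__ == "__main__":
    for host, pages in sessionize(SAMPLE_REQUESTS):
        print(host, pages)
```

Heuristics of this kind are exactly where the open issues arise: proxies and caching can hide requests or merge several users behind one address, so the resulting sessions are approximations.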
6. CONCLUSION AND FUTURE WORK
This paper has discussed the research issues and challenges in web mining and has provided a detailed review of the basic concepts of web mining, web content mining, web structure mining, web usage mining, tools, algorithms and types. Several open research issues and drawbacks that exist in current techniques are also discussed. This study and review should be helpful for researchers working in the domain of web mining.

REFERENCES
[1] Joy Shalom Sona, Prof. Asha Ambhaikar, "A Reconciling Website System to Enhance Efficiency with Web Mining Techniques", International Journal of Scientific & Engineering Research, Volume 3, Issue 2, February 2012, ISSN 2229-5518
[2] Aparna Ranade, Abhijit R. Joshi, Ph.D., "Techniques for Understanding User Usage Behavior on the Internet", International Journal of Computer Applications (0975-8887), Volume 92, No. 7, April 2014
[3] Karan Bhalla, Deepak Prasad, "Data Preparation and Pattern Discovery for Web Usage Mining"
[4] Amit Pratap Singh, Dr. R. C. Jain, "A Survey on Different Phases of Web Usage Mining for Anomaly User Behavior Investigation", International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Volume 3, Issue 3, May-June 2014, ISSN 2278-6856
[5] R. Lokeshkumar, R. Sindhuja, Dr. P. Sengottuvelan, "A Survey on Pre-processing of Web Log File in Web Usage Mining to Improve the Quality of Data", International Journal of Emerging Technology and Advanced Engineering, ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 8, August 2014
[6] Mitali Srivastava, Rakhi Garg, P. K. Mishra, "Preprocessing Techniques in Web Usage Mining: A Survey", International Journal of Computer Applications (0975-8887), Volume 97, No. 18, July 2014
[7] http://www.slideshare.net/akhanna3/discovering-knowledge-using-web-structure-mining-27488978
[8] Ashish Kumar Garg, Mohammad Amir, Jarrar Ahmed, Man Singh, Sham Bansa, "Implementation of a Search Engine", International Journal of Science and Research (IJSR), ISSN (Online): 2319-7064
[9] C. Gomathi, M. Moorthi, "Web Access Pattern Algorithms in Education Domain", Computer and Information Science Journal, Vol. 1, No. 4, November 2008
[10] Md. Zahid Hasan, Khawja Jakaria Ahmad Chisty, Nur-E-Zaman Ayshik, "Research Challenges in Web Data Mining", International Journal of Computer Science and Telecommunications, Volume 3, Issue 7, July 2012
[11] Jaideep Srivastava, "Web Mining: Accomplishments & Future Directions", University of Minnesota, USA, srivasta@cs.umn.edu, http://www.cs.umn.edu/faculty/srivasta.html
[12] http://www.slideshare.net/AmirFahmideh/web-mining-structure-mining
[13] http://www.slideshare.net/Tommy96/web-mining-tutorial
[14] http://www.faadooengineers.com/threads/2177-PPT-Link-Mining
[15] Deepti Kapila, Prof. Charanjit Singh, "Survey on Page Ranking Algorithms for Digital Libraries", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 4, Issue 6, June 2014, ISSN 2277 128X
[16] Alberto Sillitti, Marco Scotto, Giancarlo Succi, Tullio Vernazza, "News Miner: a Tool for Information Retrieval"
[17] Sandhya, Mala Chaturvedi, "A Survey on Web Mining Algorithms", The International Journal of Engineering and Science (IIJES), Volume 2, Issue 3
[18] Ananthi J., "A Survey Web Content Mining Methods and Applications for Information Extraction from Online Shopping Sites", International Journal of Computer Science and Information Technologies, Vol. 5 (3), 2014
[19] S. Balan, "A Study of Various Techniques of Web Content Mining Research Issues and Tools", International Journal of Innovative Research and Studies, ISSN 2319-9725
[20] D. Jayalatchumy, Dr. P. Thambidurai, "Web Mining Research Issues and Future Directions – A Survey", IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN 2278-0661, p-ISSN 2278-8727, Volume 14, Issue 3
[21] Naga Lakshmi, Raja Sekhara Rao, Sai Satyanarayana Reddy, "An Overview of Preprocessing on Web Log Data for Web Usage Analysis", International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN 2278-3075, Volume 2, Issue 4
[22] Mamta M. Hegde, Prof. M. V. Phatak, "Developing an approach for hyperlink analysis with noise reduction using Web Structure Mining", International Journal of Advanced Research in Computer Engineering & Technology, Volume 1, Issue 3, May 2012
[23] Mr. Dushyant B. Rathod, Dr. Samrat Khanna, "A Review on Emerging Trends of Web Mining and its Applications", ISSN 2321-9939

BIOGRAPHY
Dr. S. Vijayarani has completed MCA, M.Phil and Ph.D in Computer Science. She is working as Assistant Professor in the School of Computer Science and Engineering, Bharathiar University, Coimbatore. Her fields of research interest are data mining, privacy and security issues in data mining, and data streams. She has published papers in international journals and presented research papers at international and national conferences.

Ms. E. Suganya has completed M.Sc in Computer Science. She is currently pursuing her M.Phil in Computer Science in the School of Computer Science and Engineering, Bharathiar University, Coimbatore. Her fields of interest are Data Mining and Web Mining.