SlideShare a Scribd company logo
1 of 9
Download to read offline
Scientific Journal Impact Factor (SJIF): 1.711
International Journal of Modern Trends in Engineering and
Research
www.ijmter.com
@IJMTER-2014, All rights Reserved 11
e-ISSN: 2349-9745
p-ISSN: 2393-8161
A Comparative Study of Recommendation System Using Web Usage
Mining
Mr. D. C. Karthick Kumar1
, Mr. R. Lokesh Kumar2
, Dr. P. Sengottuvelan3
PG Scholar1
, Assistant Professor2
, Associate Professor3
Department of IT1,2,3
Bannari Amman Institute of Technology1,2,3
Abstract—Web Mining is one of the Developing field in research. Exact custom of the Web is to get the
beneficial material in the sites. To reduce the work time of user the Web Usage Mining (WUM) technique
is introduced. In this Technique use Web Page recommendation for the Web request from the user. For
the recommendation system in Web Usage Mining (WUM) variousauthor has introduce different
Algorithm and technique to improve the user interest in surfing the Web. Web log files are used todefine
the user interest and there next recommend page to view.The data stored in the web log file consist of
large amount oferoded, incomplete, and unnecessary information. So, the Web log files have to pre-
process, customize, and to clean the data. In this paper we will survey different recommendation technique
to identify the issues in web surfing and to improve web usagemining (WUM) pre-processing for pattern
mining and analysis.
Keywords-Web Mining, Web Usage Mining, Pre-processing, Recommendation technique, Pattern
Mining
I. INTRODUCTION
With the rapid growth of the Web, it becomes more and more difficult for Web users to find useful
information. In particular, a Web user often wanders aimless on the Web without visiting pages of his/her
interests, or spends a long time to find the expected information. Web page recommendation is thus
proposed to address this problem. It aims to understand the users’ behaviors, and guide users to visit pages
of their interests at a specific time. An essential task of Web page recommendation is to understand users’
navigation behaviors from their Web usage data, and devise a model to predict what pages the users are
more likely to visit at the next step.
Web Usage Mining is the submission of data mining techniques to determine motivating usage
shapes from Web data in order to know and well aid the needs of Web-based tenders. Usage data seizures
the self or cause of web users sideways with their browsing performance at a web site. Web usage mining
that one can be confidential extra dependent on the kind of tradition data measured:
Web Server Data: The user logs are collected by the web server. It contains various unwanted data
like IP address, page reference and access time.
Application Server Data: Profitable bid aides have substantial landscapes to qualify e-commerce
tenders to be built on top of them with slight power. A key article is the talent to roadway countless
breeds of commerce dealings and records them in submission attendant woods.
International Journal of Modern Trends in Engineering and Research (IJMTER)
Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161
@IJMTER-2014, All rights Reserved 12
Application Level Data: Novel kinds of actions can be well-defined in a claim, and sorting can be
bowed on for them thus producing antiquities of these especially definite events. It must be famous, but,
that many end claims require a blend of one or more of the procedures useful in the groupings overhead.
Studies correlated to work are afraid with two expanses: constraint-based data mining systems applied in
Web Usage Mining and settled software riggings.
Figure 1. Classification of Web Usage Mining
Agent log file is used to record the information about user’s browser, browsers version and
operating system. Different versions of different users browsing history are very helpful for designer and
web site changes are made accordingly.
Access log file is one of the major web log server, it will record each click, hits and access of the
users for capturing information about user, can use number of attributes. Table 1.describes different
attributes of access log file along with their explanation.
Table 1. Attributes of Log File and Description
Attributes Description
Client IP Client machine IP Address
Client name
Client Name if required by server
otherwise hyphen(-)
Date
Date is recorded when User Made access
and transfer.
Time Time of transformation is recorded
International Journal of Modern Trends in Engineering and Research (IJMTER)
Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161
@IJMTER-2014, All rights Reserved 13
Server site name
Internet service name as appeared on
client machine
Server Computer name Server name
Server IP
Server IP provided by internet service
provider
Server port
Server port configured for data
transformation
Client server URL
stream
Targeted default web page of web site
Client server URL
Query
Client query which starts after “?”
Server client status Status code returned by server link
Server client Bytes Number of bytes sent by server to client
Client server Bytes Number of bytes received by client
Client server methods
Client method or model of request can be
Get, POST or HEAD
Time taken
How much spent by client to perform an
action
Client server version Protocol version like HTTP
Client server host Host header name
User agent Browser type that client used
Cookies Contents of cookies
Referrer Link from where client jump to this site
Server client win32
status
Windows status code
II. MOTIVATION
Preprocessing is the essential step for web usage mining. Preprocessing phase will occur after
creating a web log file. A web log file is an input of preprocessing phase of web usage mining. Web log
file is large in large, contains number of raw datas, images, audio/ video files. The main aim of
preprocessing is to make an unstructures/semistructured web log file data into a structured form by
International Journal of Modern Trends in Engineering and Research (IJMTER)
Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161
@IJMTER-2014, All rights Reserved 14
eliminating the unwanted datas from log. We came to a conclusion that preprocessing is very important in
web usage mining. Preprocessing steps increases the quality of data and improves the effictiveness and
efficiency of the other steps in web usage mining like pattern discovery and pattern analyais.
Figure 2. Pre-processing Technique Process
III. LITERATURE STUDY
The focus of literature review is to study, compare and contrast the available Recommendation
system techniques. Due to large amount of irrelevant entries in the web log file, the original log file cannot
be directly used in WUM process. Therefore, the pre-processing of web log file becomes more essential.
Authors Xiaogang ,Wang Yan Bai and Yue Li from Wuhan University of Science and Engineering
in china has made a work fashionable An Material Recapture Way Based On Serial Admittance Patterns
they has proposed an intelligent web recommender organization known as WAPPS based on successive
web admission decorations. In that the successive shape mining algorithm CS-mine is used to mine
everyday sequential web access patterns. The mined shapes are stored in the Pattern-tree, which is then
used for similar and creating web links for operational approvals. The projected system has completed
upright act with high pleasure and applicability.
In every Sequential Pattern Mining the web log should containdate-timestamp, client IP address,
user ID, requested URL, and HTTP status code. With a various sample database of web access sequences
the support threshold MinSup = 3 in sample database. In pattern-tree construction algorithm is based on
the set of sequential web access patterns mined by the sequential pattern mining component using cs-mine.
Recommendation Rules Generation Algorithm is also used for shorter matching paths usually have lower
accuracy in the Web log files. With various formula they have produce the complexity Analysis.
Authors Ralf Krestel and Peter Fankhauser from Germany made work in topic Language Models
and Topic Models for Personalizing Tag Recommendation in thisthey introduce an approach to
International Journal of Modern Trends in Engineering and Research (IJMTER)
Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161
@IJMTER-2014, All rights Reserved 15
personalized tag recommendation that combines a probabilistic model of tags from the resource with tags
from the user. As models we investigate simple language models as well as Latent Dirichlet Allocation.
Extensive experiments on a real world dataset crawled from a big tagging system show that personalization
improves tag recommendation, and our approach significantly outperforms state-of-the-art approaches.
In their work they have travelled user-centered and source- centered methods for modified tag
endorsement. We likened and active a linguistic presentation tactic and a method based on Latent Dirichlet
Distribution. We additional- more methodically examined the use of linguistic replicas and LDA for tag
endorsement presentation that simple linguistic replicas constructed from users and possessions harvest
modest presentation while overwhelming only a portion of the computational costs likened to more cultured
procedures. We presented that the grouping of together approaches (LDA and LM) custom-made to
manipulators and assets outstrip state-of-the-art tag recommendation procedures with admiration to a
comprehensive diversity of performance metrics. It would also be stimulating to see whether the
performance of the procedures variations after functional to a snap or video classification structure in its
place of a labelling scheme for web folios. Finally, we also design to explore how added circumstantial
information such as time, position, and existing mission can be charity to additional advance tag
endorsement.
Authors Shiva Nadi, Mohamad Saraee, Mohamad Davarpanah-Jazi from Islamic Azad University
of Najafabad in Iran has made a research on a topic A Fuzzy Recommender System for Dynamic Prediction
of User's Behavior with various algorithm and fuzzy clustering techniques. They make the preprocessing
in the raw log files whichhave useful information about access of all users to a specific website. They have
integrated web content mining to web usage mining for finding users interesting rules with Fuzzy clustering
techniques.
Knowledge discovery Process is created with various steps has follow they are as preprocessing for
remove the unwanted images and data in log files by data cleaning process. In Web content mining they
use the document clustering to group the pages in content based clusters. Integrate the log files with website
pages to give recommendation for on-line user’s for the first time in sites.Recommendation Process is
provide if a new user starts a transaction, our model matches the new user with the most similar user clusters
and provides suitable recommendations to him as Support Identification. In Match Score Identification has
match score calculation defines highest match user cluster for active user also it give us a list of
corresponding user clusters from the highest match score down to lowest match score.
For the Experiment they have use the log files from Information and Communication Technology
Center of Isfahan municipality in Iran (F A VA) for IP address 80.191.136.6 for a period of one week. They
have produced an effective result with the simulation experiment.
Authors Mozhgan Azimpour-Kivi form Sharif University of Technology in Iran and Reza Azmi
from Alzahra University in Iran has made work with A Webpage Similarity Measure for Web Sessions
Clustering Using Sequence Alignment. Web sessions clustering is a process of web usage mining task that
aims to group web sessions with similar trends and usage patterns into clusters. This process is used to
improve the web management and to provide the efficient web recommender system for the user. The
proposed techniques similarity measure for comparing web sessions with sequential Alignment method
and use two web sessions based on the time a user spends on a webpage, and also the frequency of visit of
each page within the session.
In their work they have compare URLs for web pages similarity and also the importance of the user
results in web similarity sessions. In URL they not consider the content of web page only path leading to a
International Journal of Modern Trends in Engineering and Research (IJMTER)
Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161
@IJMTER-2014, All rights Reserved 16
web page by making the tree structure to determine the length of the longest token string among two
websites. In Web Page Similarity Based on the Importance to the User is based on the interesting of the
user by visiting the pages within the time session. It is calculated and compare with two pages to provide
recommendation for the next users. With the similarity of web session they apply the Sequence Alignment
method with the various formulas they have calculate similarity between two URLs in the webpages used
by interesting user. In the feature we have a better estimation of the similarity of two web pages, we can
use other methods proposed for web content mining such as Information Retrieval or semantic web
approaches. Furthermore, for having a more general evaluation, we can use a larger collection of web
sessions data and apply different clustering algorithms on these data.
Table 2: Various Recommendation Systems and Techniques
Author(s) Name Name of Paper Year of
Publication
Technique Used
Xiaogang Wang,
Yan Bai, Yue Li
An Information Retrieval
Method Based On
Sequential Access
Patterns
2010
Sequential Pattern
Mining
Ralf Krestel,
Peter Frankhauser
Language Models &
Topic Models for
Personalizing Tag
Recommendation
2010
Language Models and
Latent Dirichlet
Allocation
Shiva Nadi,
Mohamad Saraee,
Mohamad
Davarpanah-Jazi
A Fuzzy Recommender
System for Dynamic
Prediction of User's
Behavior
2010
Fuzzy Clustering
Technique
MozhganAzimpo
ur-Kivi, Reza
Azmi
A Webpage Similarity
Measure for Web
Sessions Clustering
Using Sequence
Alignment
2011
Sequence Alignment
Method
IV. CONCLUSION AND FUTURE WORK
Preprocessing of web log file is first necessary and important process for web usage mining.
Cleaned data after preprocessing is solid base for pattern mining and pattern analysis. Quality of pattern
mining and pattern analysis is fully dependent on preprocessing process. In this survey, we summarized
the existing web log preprocessing techniques and concluded some results. Most authentic source for web
usage mining considered Server log file. So it must be standardized and needs to be updated to capture user
access data. Some of preprocessing techniques are applied but we can use less or even ignored
preprocessing techniques to improve the quality of preprocessed data. For future work we should explore
preprocessing techniques and use them with the combination of existing techniques to make the whole
process more robust. New practices can offer the user to investigate the log file at changed level of thought
such as user assemblies to advantage improved considerate of log file we need ordered huddling by using
wished-for bundling system.
International Journal of Modern Trends in Engineering and Research (IJMTER)
Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161
@IJMTER-2014, All rights Reserved 17
REFERENCES
[1] M. Eirinaki and M. Vazirgiannis, “Web mining for web personalization”, ACM Transactions on Internet Technology, Vol.
3,No. 1, 2003, pp. 1-27.
[2] J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon, and J. Riedl,“GroupLens: applying collaborative filtering to
usenet news”, Communications of the ACM, 40(3), 1997, pp. 77-87.
[3] T. Joachims, D. Freitag, and T. Mitchel, “WebWatcher: a tour guide for the World Wide Web”, Proc. of the 5th
International Joint Conference on AI, Japan, 1997, pp. 770-775.
[4] B. Mobasher, H. Dai, T. Luo and M. Nakagawa, “Effective personalization based on association rule discovery from web
usagedata”, Proc. of the 3rd ACM Workshop on Web Information and Data Management (WIDM01), 2001.
[5] R. Cooley, B. Mobasher, and J. Srivastava, “Data preparation for mining World Wide Web browsing patterns”, Journal of
Knowledge and Information Systems, Vol. 1, No. 1. 1999.
[6] H. Halpin, V. Robu, and H. Shepherd, “The complex dynamics of collaborative tagging,” in Proceedings of the 16th
International Conference on World Wide Web (WWW 2007). ACM, 2007, pp. 211–220.
[7] S. A. Golder and B. A. Huberman, “The structure of collaborative tagging systems,” CoRR, vol. abs/cs/0508082, 2005.
[8] D. Bollen and H. Halpin, “An experimental analysis of suggestions in collaborative tagging,” in 2009 IEEE/WIC/ACM
International Conferenceon Web Intelligence, WI 2009, Milan, Italy, 15-18 September 2009,Main Conference
Proceedings. IEEE, 2009, pp. 108–115.
[9] M. P. Singh, "Web Usage Mining and Personalization" , a chapter in Practical Handbook of Internet Computing, Munindar
P. Singh (ed.), CRC Press, 2005.
[10]G. Castellano, A M. Fanelli, C. Mencar and M. Alessandra Torsello, "Similarity-based Fuzzy clustering for user profiling",
IEEEIWICIACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, 2007.
[11]G. Castellano, AM. Fanelli, P. Plantamura, M.A Torsello, A Neuro-Fuzzy Strategy for Web Personalization, Proceedings
of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008
[12]VA Koutsonikola and AI. akali, 2009. A fuzzy bi clustering approach to correlate web users nad web pages,
IntI.Knowledge and Web Intelligence, VoU, Nos.1/2, pp. 3- 23
[13]D.H.Kraft, J. Chen, M. Bautista, MJ., and MA Vila, "Textual Information Retrieval with User Profiles Using Fuzzy
Clustering and," Intelligent Exploration of the Web,Heidelberg, Germany: Physica-Verlag, 2002.
[14]S. Taherizadeh and N. Moghadam, 2009. Integrating web content mining into web usage mining for finding patterns and
predicting users behaviors, International Journal of Information Science and Management, Vol, 7, No.1, pp.51-66
[15]T. Hussain, S. Asghar, and S. Fong, “A hierarchical cluster based preprocessing methodology for Web Usage Mining,” in
6th
International Conference on Advanced Information Management and Service (IMS), 2010, pp. 472-477.
[16]P. J. Rousseeuw, “Silhouettes: A graphical aid to the interpretation and validation of cluster analysis,” Journal of
Computational and Applied Mathematics, vol. 20, pp. 53-65, Nov. 1987.
A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining

More Related Content

What's hot

Odam an optimized distributed association rule mining algorithm (synopsis)
Odam an optimized distributed association rule mining algorithm (synopsis)Odam an optimized distributed association rule mining algorithm (synopsis)
Odam an optimized distributed association rule mining algorithm (synopsis)
Mumbai Academisc
 
Applying web mining application for user behavior understanding
Applying web mining application for user behavior understandingApplying web mining application for user behavior understanding
Applying web mining application for user behavior understanding
Zakaria Zubi
 

What's hot (18)

International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Web personalization using clustering of web usage data
Web personalization using clustering of web usage dataWeb personalization using clustering of web usage data
Web personalization using clustering of web usage data
 
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
 
Odam an optimized distributed association rule mining algorithm (synopsis)
Odam an optimized distributed association rule mining algorithm (synopsis)Odam an optimized distributed association rule mining algorithm (synopsis)
Odam an optimized distributed association rule mining algorithm (synopsis)
 
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
 
WEB MINING – A CATALYST FOR E-BUSINESS
WEB MINING – A CATALYST FOR E-BUSINESSWEB MINING – A CATALYST FOR E-BUSINESS
WEB MINING – A CATALYST FOR E-BUSINESS
 
a novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studioa novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studio
 
A comprehensive study of mining web data
A comprehensive study of mining web dataA comprehensive study of mining web data
A comprehensive study of mining web data
 
Framework for web personalization using web mining
Framework for web personalization using web miningFramework for web personalization using web mining
Framework for web personalization using web mining
 
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web LogsWeb Usage Mining: A Survey on User's Navigation Pattern from Web Logs
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logs
 
Website Content Analysis Using Clickstream Data and Apriori Algorithm
Website Content Analysis Using Clickstream Data and Apriori AlgorithmWebsite Content Analysis Using Clickstream Data and Apriori Algorithm
Website Content Analysis Using Clickstream Data and Apriori Algorithm
 
Applying web mining application for user behavior understanding
Applying web mining application for user behavior understandingApplying web mining application for user behavior understanding
Applying web mining application for user behavior understanding
 
Preprocessing of Web Log Data for Web Usage Mining
Preprocessing of Web Log Data for Web Usage MiningPreprocessing of Web Log Data for Web Usage Mining
Preprocessing of Web Log Data for Web Usage Mining
 
Bb31269380
Bb31269380Bb31269380
Bb31269380
 
Design and Implementation of SOA Enhanced Semantic Information Retrieval web ...
Design and Implementation of SOA Enhanced Semantic Information Retrieval web ...Design and Implementation of SOA Enhanced Semantic Information Retrieval web ...
Design and Implementation of SOA Enhanced Semantic Information Retrieval web ...
 
Optimization of Search Results with Duplicate Page Elimination using Usage Data
Optimization of Search Results with Duplicate Page Elimination using Usage DataOptimization of Search Results with Duplicate Page Elimination using Usage Data
Optimization of Search Results with Duplicate Page Elimination using Usage Data
 
A270104
A270104A270104
A270104
 
Web Mining
Web Mining Web Mining
Web Mining
 

Similar to A Comparative Study of Recommendation System Using Web Usage Mining

MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
ijcsa
 
Data preparation for mining world wide web browsing patterns (1999)
Data preparation for mining world wide web browsing patterns (1999)Data preparation for mining world wide web browsing patterns (1999)
Data preparation for mining world wide web browsing patterns (1999)
OUM SAOKOSAL
 

Similar to A Comparative Study of Recommendation System Using Web Usage Mining (20)

Implementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server MonitoringImplementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server Monitoring
 
C017231726
C017231726C017231726
C017231726
 
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningA Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
 
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
 
Research Paper
Research PaperResearch Paper
Research Paper
 
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
 
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
MULTIFACTOR NAÏVE BAYES CLASSIFICATION FOR THE SLOW LEARNER PREDICTION OVER M...
 
Enactment of Firefly Algorithm and Fuzzy C-Means Clustering For Consumer Requ...
Enactment of Firefly Algorithm and Fuzzy C-Means Clustering For Consumer Requ...Enactment of Firefly Algorithm and Fuzzy C-Means Clustering For Consumer Requ...
Enactment of Firefly Algorithm and Fuzzy C-Means Clustering For Consumer Requ...
 
Recommendation generation by integrating sequential
Recommendation generation by integrating sequentialRecommendation generation by integrating sequential
Recommendation generation by integrating sequential
 
Recommendation generation by integrating sequential pattern mining and semantics
Recommendation generation by integrating sequential pattern mining and semanticsRecommendation generation by integrating sequential pattern mining and semantics
Recommendation generation by integrating sequential pattern mining and semantics
 
A Survey on: Utilizing of Different Features in Web Behavior Prediction
A Survey on: Utilizing of Different Features in Web Behavior PredictionA Survey on: Utilizing of Different Features in Web Behavior Prediction
A Survey on: Utilizing of Different Features in Web Behavior Prediction
 
Pxc3893553
Pxc3893553Pxc3893553
Pxc3893553
 
IRJET-A Survey on Web Personalization of Web Usage Mining
IRJET-A Survey on Web Personalization of Web Usage MiningIRJET-A Survey on Web Personalization of Web Usage Mining
IRJET-A Survey on Web Personalization of Web Usage Mining
 
Data preparation for mining world wide web browsing patterns (1999)
Data preparation for mining world wide web browsing patterns (1999)Data preparation for mining world wide web browsing patterns (1999)
Data preparation for mining world wide web browsing patterns (1999)
 
Web usage Mining Based on Request Dependency Graph
Web usage Mining Based on Request Dependency GraphWeb usage Mining Based on Request Dependency Graph
Web usage Mining Based on Request Dependency Graph
 
Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining
 
Automatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage miningAutomatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage mining
 
IRJET- Enhancing Prediction of User Behavior on the Basic of Web Logs
IRJET- Enhancing Prediction of User Behavior on the Basic of Web LogsIRJET- Enhancing Prediction of User Behavior on the Basic of Web Logs
IRJET- Enhancing Prediction of User Behavior on the Basic of Web Logs
 
Search Engine Scrapper
Search Engine ScrapperSearch Engine Scrapper
Search Engine Scrapper
 
A NEW IMPROVED WEIGHTED ASSOCIATION RULE MINING WITH DYNAMIC PROGRAMMING APPR...
A NEW IMPROVED WEIGHTED ASSOCIATION RULE MINING WITH DYNAMIC PROGRAMMING APPR...A NEW IMPROVED WEIGHTED ASSOCIATION RULE MINING WITH DYNAMIC PROGRAMMING APPR...
A NEW IMPROVED WEIGHTED ASSOCIATION RULE MINING WITH DYNAMIC PROGRAMMING APPR...
 

More from Editor IJMTER

A NEW DATA ENCODER AND DECODER SCHEME FOR NETWORK ON CHIP
A NEW DATA ENCODER AND DECODER SCHEME FOR  NETWORK ON CHIPA NEW DATA ENCODER AND DECODER SCHEME FOR  NETWORK ON CHIP
A NEW DATA ENCODER AND DECODER SCHEME FOR NETWORK ON CHIP
Editor IJMTER
 
A CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMES
A CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMESA CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMES
A CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMES
Editor IJMTER
 
Textual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative AnalysisTextual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative Analysis
Editor IJMTER
 
SURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICE
SURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICESURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICE
SURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICE
Editor IJMTER
 
Software Quality Analysis Using Mutation Testing Scheme
Software Quality Analysis Using Mutation Testing SchemeSoftware Quality Analysis Using Mutation Testing Scheme
Software Quality Analysis Using Mutation Testing Scheme
Editor IJMTER
 
Software Defect Prediction Using Local and Global Analysis
Software Defect Prediction Using Local and Global AnalysisSoftware Defect Prediction Using Local and Global Analysis
Software Defect Prediction Using Local and Global Analysis
Editor IJMTER
 

More from Editor IJMTER (20)

A NEW DATA ENCODER AND DECODER SCHEME FOR NETWORK ON CHIP
A NEW DATA ENCODER AND DECODER SCHEME FOR  NETWORK ON CHIPA NEW DATA ENCODER AND DECODER SCHEME FOR  NETWORK ON CHIP
A NEW DATA ENCODER AND DECODER SCHEME FOR NETWORK ON CHIP
 
A RESEARCH - DEVELOP AN EFFICIENT ALGORITHM TO RECOGNIZE, SEPARATE AND COUNT ...
A RESEARCH - DEVELOP AN EFFICIENT ALGORITHM TO RECOGNIZE, SEPARATE AND COUNT ...A RESEARCH - DEVELOP AN EFFICIENT ALGORITHM TO RECOGNIZE, SEPARATE AND COUNT ...
A RESEARCH - DEVELOP AN EFFICIENT ALGORITHM TO RECOGNIZE, SEPARATE AND COUNT ...
 
Analysis of VoIP Traffic in WiMAX Environment
Analysis of VoIP Traffic in WiMAX EnvironmentAnalysis of VoIP Traffic in WiMAX Environment
Analysis of VoIP Traffic in WiMAX Environment
 
A Hybrid Cloud Approach for Secure Authorized De-Duplication
A Hybrid Cloud Approach for Secure Authorized De-DuplicationA Hybrid Cloud Approach for Secure Authorized De-Duplication
A Hybrid Cloud Approach for Secure Authorized De-Duplication
 
Aging protocols that could incapacitate the Internet
Aging protocols that could incapacitate the InternetAging protocols that could incapacitate the Internet
Aging protocols that could incapacitate the Internet
 
A Cloud Computing design with Wireless Sensor Networks For Agricultural Appli...
A Cloud Computing design with Wireless Sensor Networks For Agricultural Appli...A Cloud Computing design with Wireless Sensor Networks For Agricultural Appli...
A Cloud Computing design with Wireless Sensor Networks For Agricultural Appli...
 
A CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMES
A CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMESA CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMES
A CAR POOLING MODEL WITH CMGV AND CMGNV STOCHASTIC VEHICLE TRAVEL TIMES
 
Sustainable Construction With Foam Concrete As A Green Green Building Material
Sustainable Construction With Foam Concrete As A Green Green Building MaterialSustainable Construction With Foam Concrete As A Green Green Building Material
Sustainable Construction With Foam Concrete As A Green Green Building Material
 
USE OF ICT IN EDUCATION ONLINE COMPUTER BASED TEST
USE OF ICT IN EDUCATION ONLINE COMPUTER BASED TESTUSE OF ICT IN EDUCATION ONLINE COMPUTER BASED TEST
USE OF ICT IN EDUCATION ONLINE COMPUTER BASED TEST
 
Textual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative AnalysisTextual Data Partitioning with Relationship and Discriminative Analysis
Textual Data Partitioning with Relationship and Discriminative Analysis
 
Testing of Matrices Multiplication Methods on Different Processors
Testing of Matrices Multiplication Methods on Different ProcessorsTesting of Matrices Multiplication Methods on Different Processors
Testing of Matrices Multiplication Methods on Different Processors
 
Survey on Malware Detection Techniques
Survey on Malware Detection TechniquesSurvey on Malware Detection Techniques
Survey on Malware Detection Techniques
 
SURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICE
SURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICESURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICE
SURVEY OF TRUST BASED BLUETOOTH AUTHENTICATION FOR MOBILE DEVICE
 
SURVEY OF GLAUCOMA DETECTION METHODS
SURVEY OF GLAUCOMA DETECTION METHODSSURVEY OF GLAUCOMA DETECTION METHODS
SURVEY OF GLAUCOMA DETECTION METHODS
 
Survey: Multipath routing for Wireless Sensor Network
Survey: Multipath routing for Wireless Sensor NetworkSurvey: Multipath routing for Wireless Sensor Network
Survey: Multipath routing for Wireless Sensor Network
 
Step up DC-DC Impedance source network based PMDC Motor Drive
Step up DC-DC Impedance source network based PMDC Motor DriveStep up DC-DC Impedance source network based PMDC Motor Drive
Step up DC-DC Impedance source network based PMDC Motor Drive
 
SPIRITUAL PERSPECTIVE OF AUROBINDO GHOSH’S PHILOSOPHY IN TODAY’S EDUCATION
SPIRITUAL PERSPECTIVE OF AUROBINDO GHOSH’S PHILOSOPHY IN TODAY’S EDUCATIONSPIRITUAL PERSPECTIVE OF AUROBINDO GHOSH’S PHILOSOPHY IN TODAY’S EDUCATION
SPIRITUAL PERSPECTIVE OF AUROBINDO GHOSH’S PHILOSOPHY IN TODAY’S EDUCATION
 
Software Quality Analysis Using Mutation Testing Scheme
Software Quality Analysis Using Mutation Testing SchemeSoftware Quality Analysis Using Mutation Testing Scheme
Software Quality Analysis Using Mutation Testing Scheme
 
Software Defect Prediction Using Local and Global Analysis
Software Defect Prediction Using Local and Global AnalysisSoftware Defect Prediction Using Local and Global Analysis
Software Defect Prediction Using Local and Global Analysis
 
Software Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeSoftware Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking Scheme
 

Recently uploaded

Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptx
pritamlangde
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
jaanualu31
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 
Query optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsQuery optimization and processing for advanced database systems
Query optimization and processing for advanced database systems
meharikiros2
 

Recently uploaded (20)

Digital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptxDigital Communication Essentials: DPCM, DM, and ADM .pptx
Digital Communication Essentials: DPCM, DM, and ADM .pptx
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Introduction to Geographic Information Systems
Introduction to Geographic Information SystemsIntroduction to Geographic Information Systems
Introduction to Geographic Information Systems
 
Introduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdfIntroduction to Data Visualization,Matplotlib.pdf
Introduction to Data Visualization,Matplotlib.pdf
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessor8086 Microprocessor Architecture: 16-bit microprocessor
8086 Microprocessor Architecture: 16-bit microprocessor
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Convergence of Robotics and Gen AI offers excellent opportunities for Entrepr...
Convergence of Robotics and Gen AI offers excellent opportunities for Entrepr...Convergence of Robotics and Gen AI offers excellent opportunities for Entrepr...
Convergence of Robotics and Gen AI offers excellent opportunities for Entrepr...
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Query optimization and processing for advanced database systems
Query optimization and processing for advanced database systemsQuery optimization and processing for advanced database systems
Query optimization and processing for advanced database systems
 
Path loss model, OKUMURA Model, Hata Model
Path loss model, OKUMURA Model, Hata ModelPath loss model, OKUMURA Model, Hata Model
Path loss model, OKUMURA Model, Hata Model
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
Max. shear stress theory-Maximum Shear Stress Theory ​ Maximum Distortional ...
Max. shear stress theory-Maximum Shear Stress Theory ​  Maximum Distortional ...Max. shear stress theory-Maximum Shear Stress Theory ​  Maximum Distortional ...
Max. shear stress theory-Maximum Shear Stress Theory ​ Maximum Distortional ...
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using PipesLinux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
 
Introduction to Artificial Intelligence ( AI)
Introduction to Artificial Intelligence ( AI)Introduction to Artificial Intelligence ( AI)
Introduction to Artificial Intelligence ( AI)
 
Ground Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth ReinforcementGround Improvement Technique: Earth Reinforcement
Ground Improvement Technique: Earth Reinforcement
 

A Comparative Study of Recommendation System Using Web Usage Mining

  • 1. Scientific Journal Impact Factor (SJIF): 1.711 International Journal of Modern Trends in Engineering and Research www.ijmter.com @IJMTER-2014, All rights Reserved 11 e-ISSN: 2349-9745 p-ISSN: 2393-8161 A Comparative Study of Recommendation System Using Web Usage Mining Mr. D. C. Karthick Kumar1 , Mr. R. Lokesh Kumar2 , Dr. P. Sengottuvelan3 PG Scholar1 , Assistant Professor2 , Associate Professor3 Department of IT1,2,3 Bannari Amman Institute of Technology1,2,3 Abstract—Web Mining is one of the Developing field in research. Exact custom of the Web is to get the beneficial material in the sites. To reduce the work time of user the Web Usage Mining (WUM) technique is introduced. In this Technique use Web Page recommendation for the Web request from the user. For the recommendation system in Web Usage Mining (WUM) variousauthor has introduce different Algorithm and technique to improve the user interest in surfing the Web. Web log files are used todefine the user interest and there next recommend page to view.The data stored in the web log file consist of large amount oferoded, incomplete, and unnecessary information. So, the Web log files have to pre- process, customize, and to clean the data. In this paper we will survey different recommendation technique to identify the issues in web surfing and to improve web usagemining (WUM) pre-processing for pattern mining and analysis. Keywords-Web Mining, Web Usage Mining, Pre-processing, Recommendation technique, Pattern Mining I. INTRODUCTION With the rapid growth of the Web, it becomes more and more difficult for Web users to find useful information. In particular, a Web user often wanders aimless on the Web without visiting pages of his/her interests, or spends a long time to find the expected information. Web page recommendation is thus proposed to address this problem. It aims to understand the users’ behaviors, and guide users to visit pages of their interests at a specific time. An essential task of Web page recommendation is to understand users’ navigation behaviors from their Web usage data, and devise a model to predict what pages the users are more likely to visit at the next step. Web Usage Mining is the submission of data mining techniques to determine motivating usage shapes from Web data in order to know and well aid the needs of Web-based tenders. Usage data seizures the self or cause of web users sideways with their browsing performance at a web site. Web usage mining that one can be confidential extra dependent on the kind of tradition data measured: Web Server Data: The user logs are collected by the web server. It contains various unwanted data like IP address, page reference and access time. Application Server Data: Profitable bid aides have substantial landscapes to qualify e-commerce tenders to be built on top of them with slight power. A key article is the talent to roadway countless breeds of commerce dealings and records them in submission attendant woods.
  • 2. International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161 @IJMTER-2014, All rights Reserved 12 Application Level Data: Novel kinds of actions can be well-defined in a claim, and sorting can be bowed on for them thus producing antiquities of these especially definite events. It must be famous, but, that many end claims require a blend of one or more of the procedures useful in the groupings overhead. Studies correlated to work are afraid with two expanses: constraint-based data mining systems applied in Web Usage Mining and settled software riggings. Figure 1. Classification of Web Usage Mining Agent log file is used to record the information about user’s browser, browsers version and operating system. Different versions of different users browsing history are very helpful for designer and web site changes are made accordingly. Access log file is one of the major web log server, it will record each click, hits and access of the users for capturing information about user, can use number of attributes. Table 1.describes different attributes of access log file along with their explanation. Table 1. Attributes of Log File and Description Attributes Description Client IP Client machine IP Address Client name Client Name if required by server otherwise hyphen(-) Date Date is recorded when User Made access and transfer. Time Time of transformation is recorded
  • 3. International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161 @IJMTER-2014, All rights Reserved 13 Server site name Internet service name as appeared on client machine Server Computer name Server name Server IP Server IP provided by internet service provider Server port Server port configured for data transformation Client server URL stream Targeted default web page of web site Client server URL Query Client query which starts after “?” Server client status Status code returned by server link Server client Bytes Number of bytes sent by server to client Client server Bytes Number of bytes received by client Client server methods Client method or model of request can be Get, POST or HEAD Time taken How much spent by client to perform an action Client server version Protocol version like HTTP Client server host Host header name User agent Browser type that client used Cookies Contents of cookies Referrer Link from where client jump to this site Server client win32 status Windows status code II. MOTIVATION Preprocessing is the essential step for web usage mining. Preprocessing phase will occur after creating a web log file. A web log file is an input of preprocessing phase of web usage mining. Web log file is large in large, contains number of raw datas, images, audio/ video files. The main aim of preprocessing is to make an unstructures/semistructured web log file data into a structured form by
  • 4. International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161 @IJMTER-2014, All rights Reserved 14 eliminating the unwanted datas from log. We came to a conclusion that preprocessing is very important in web usage mining. Preprocessing steps increases the quality of data and improves the effictiveness and efficiency of the other steps in web usage mining like pattern discovery and pattern analyais. Figure 2. Pre-processing Technique Process III. LITERATURE STUDY The focus of literature review is to study, compare and contrast the available Recommendation system techniques. Due to large amount of irrelevant entries in the web log file, the original log file cannot be directly used in WUM process. Therefore, the pre-processing of web log file becomes more essential. Authors Xiaogang ,Wang Yan Bai and Yue Li from Wuhan University of Science and Engineering in china has made a work fashionable An Material Recapture Way Based On Serial Admittance Patterns they has proposed an intelligent web recommender organization known as WAPPS based on successive web admission decorations. In that the successive shape mining algorithm CS-mine is used to mine everyday sequential web access patterns. The mined shapes are stored in the Pattern-tree, which is then used for similar and creating web links for operational approvals. The projected system has completed upright act with high pleasure and applicability. In every Sequential Pattern Mining the web log should containdate-timestamp, client IP address, user ID, requested URL, and HTTP status code. With a various sample database of web access sequences the support threshold MinSup = 3 in sample database. In pattern-tree construction algorithm is based on the set of sequential web access patterns mined by the sequential pattern mining component using cs-mine. Recommendation Rules Generation Algorithm is also used for shorter matching paths usually have lower accuracy in the Web log files. With various formula they have produce the complexity Analysis. Authors Ralf Krestel and Peter Fankhauser from Germany made work in topic Language Models and Topic Models for Personalizing Tag Recommendation in thisthey introduce an approach to
  • 5. International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161 @IJMTER-2014, All rights Reserved 15 personalized tag recommendation that combines a probabilistic model of tags from the resource with tags from the user. As models we investigate simple language models as well as Latent Dirichlet Allocation. Extensive experiments on a real world dataset crawled from a big tagging system show that personalization improves tag recommendation, and our approach significantly outperforms state-of-the-art approaches. In their work they have travelled user-centered and source- centered methods for modified tag endorsement. We likened and active a linguistic presentation tactic and a method based on Latent Dirichlet Distribution. We additional- more methodically examined the use of linguistic replicas and LDA for tag endorsement presentation that simple linguistic replicas constructed from users and possessions harvest modest presentation while overwhelming only a portion of the computational costs likened to more cultured procedures. We presented that the grouping of together approaches (LDA and LM) custom-made to manipulators and assets outstrip state-of-the-art tag recommendation procedures with admiration to a comprehensive diversity of performance metrics. It would also be stimulating to see whether the performance of the procedures variations after functional to a snap or video classification structure in its place of a labelling scheme for web folios. Finally, we also design to explore how added circumstantial information such as time, position, and existing mission can be charity to additional advance tag endorsement. Authors Shiva Nadi, Mohamad Saraee, Mohamad Davarpanah-Jazi from Islamic Azad University of Najafabad in Iran has made a research on a topic A Fuzzy Recommender System for Dynamic Prediction of User's Behavior with various algorithm and fuzzy clustering techniques. They make the preprocessing in the raw log files whichhave useful information about access of all users to a specific website. They have integrated web content mining to web usage mining for finding users interesting rules with Fuzzy clustering techniques. Knowledge discovery Process is created with various steps has follow they are as preprocessing for remove the unwanted images and data in log files by data cleaning process. In Web content mining they use the document clustering to group the pages in content based clusters. Integrate the log files with website pages to give recommendation for on-line user’s for the first time in sites.Recommendation Process is provide if a new user starts a transaction, our model matches the new user with the most similar user clusters and provides suitable recommendations to him as Support Identification. In Match Score Identification has match score calculation defines highest match user cluster for active user also it give us a list of corresponding user clusters from the highest match score down to lowest match score. For the Experiment they have use the log files from Information and Communication Technology Center of Isfahan municipality in Iran (F A VA) for IP address 80.191.136.6 for a period of one week. They have produced an effective result with the simulation experiment. Authors Mozhgan Azimpour-Kivi form Sharif University of Technology in Iran and Reza Azmi from Alzahra University in Iran has made work with A Webpage Similarity Measure for Web Sessions Clustering Using Sequence Alignment. Web sessions clustering is a process of web usage mining task that aims to group web sessions with similar trends and usage patterns into clusters. This process is used to improve the web management and to provide the efficient web recommender system for the user. The proposed techniques similarity measure for comparing web sessions with sequential Alignment method and use two web sessions based on the time a user spends on a webpage, and also the frequency of visit of each page within the session. In their work they have compare URLs for web pages similarity and also the importance of the user results in web similarity sessions. In URL they not consider the content of web page only path leading to a
  • 6. International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161 @IJMTER-2014, All rights Reserved 16 web page by making the tree structure to determine the length of the longest token string among two websites. In Web Page Similarity Based on the Importance to the User is based on the interesting of the user by visiting the pages within the time session. It is calculated and compare with two pages to provide recommendation for the next users. With the similarity of web session they apply the Sequence Alignment method with the various formulas they have calculate similarity between two URLs in the webpages used by interesting user. In the feature we have a better estimation of the similarity of two web pages, we can use other methods proposed for web content mining such as Information Retrieval or semantic web approaches. Furthermore, for having a more general evaluation, we can use a larger collection of web sessions data and apply different clustering algorithms on these data. Table 2: Various Recommendation Systems and Techniques Author(s) Name Name of Paper Year of Publication Technique Used Xiaogang Wang, Yan Bai, Yue Li An Information Retrieval Method Based On Sequential Access Patterns 2010 Sequential Pattern Mining Ralf Krestel, Peter Frankhauser Language Models & Topic Models for Personalizing Tag Recommendation 2010 Language Models and Latent Dirichlet Allocation Shiva Nadi, Mohamad Saraee, Mohamad Davarpanah-Jazi A Fuzzy Recommender System for Dynamic Prediction of User's Behavior 2010 Fuzzy Clustering Technique MozhganAzimpo ur-Kivi, Reza Azmi A Webpage Similarity Measure for Web Sessions Clustering Using Sequence Alignment 2011 Sequence Alignment Method IV. CONCLUSION AND FUTURE WORK Preprocessing of web log file is first necessary and important process for web usage mining. Cleaned data after preprocessing is solid base for pattern mining and pattern analysis. Quality of pattern mining and pattern analysis is fully dependent on preprocessing process. In this survey, we summarized the existing web log preprocessing techniques and concluded some results. Most authentic source for web usage mining considered Server log file. So it must be standardized and needs to be updated to capture user access data. Some of preprocessing techniques are applied but we can use less or even ignored preprocessing techniques to improve the quality of preprocessed data. For future work we should explore preprocessing techniques and use them with the combination of existing techniques to make the whole process more robust. New practices can offer the user to investigate the log file at changed level of thought such as user assemblies to advantage improved considerate of log file we need ordered huddling by using wished-for bundling system.
  • 7. International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 01, Issue 05, [November - 2014] e-ISSN: 2349-9745, p-ISSN: 2393-8161 @IJMTER-2014, All rights Reserved 17 REFERENCES [1] M. Eirinaki and M. Vazirgiannis, “Web mining for web personalization”, ACM Transactions on Internet Technology, Vol. 3,No. 1, 2003, pp. 1-27. [2] J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon, and J. Riedl,“GroupLens: applying collaborative filtering to usenet news”, Communications of the ACM, 40(3), 1997, pp. 77-87. [3] T. Joachims, D. Freitag, and T. Mitchel, “WebWatcher: a tour guide for the World Wide Web”, Proc. of the 5th International Joint Conference on AI, Japan, 1997, pp. 770-775. [4] B. Mobasher, H. Dai, T. Luo and M. Nakagawa, “Effective personalization based on association rule discovery from web usagedata”, Proc. of the 3rd ACM Workshop on Web Information and Data Management (WIDM01), 2001. [5] R. Cooley, B. Mobasher, and J. Srivastava, “Data preparation for mining World Wide Web browsing patterns”, Journal of Knowledge and Information Systems, Vol. 1, No. 1. 1999. [6] H. Halpin, V. Robu, and H. Shepherd, “The complex dynamics of collaborative tagging,” in Proceedings of the 16th International Conference on World Wide Web (WWW 2007). ACM, 2007, pp. 211–220. [7] S. A. Golder and B. A. Huberman, “The structure of collaborative tagging systems,” CoRR, vol. abs/cs/0508082, 2005. [8] D. Bollen and H. Halpin, “An experimental analysis of suggestions in collaborative tagging,” in 2009 IEEE/WIC/ACM International Conferenceon Web Intelligence, WI 2009, Milan, Italy, 15-18 September 2009,Main Conference Proceedings. IEEE, 2009, pp. 108–115. [9] M. P. Singh, "Web Usage Mining and Personalization" , a chapter in Practical Handbook of Internet Computing, Munindar P. Singh (ed.), CRC Press, 2005. [10]G. Castellano, A M. Fanelli, C. Mencar and M. Alessandra Torsello, "Similarity-based Fuzzy clustering for user profiling", IEEEIWICIACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops, 2007. [11]G. Castellano, AM. Fanelli, P. Plantamura, M.A Torsello, A Neuro-Fuzzy Strategy for Web Personalization, Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008 [12]VA Koutsonikola and AI. akali, 2009. A fuzzy bi clustering approach to correlate web users nad web pages, IntI.Knowledge and Web Intelligence, VoU, Nos.1/2, pp. 3- 23 [13]D.H.Kraft, J. Chen, M. Bautista, MJ., and MA Vila, "Textual Information Retrieval with User Profiles Using Fuzzy Clustering and," Intelligent Exploration of the Web,Heidelberg, Germany: Physica-Verlag, 2002. [14]S. Taherizadeh and N. Moghadam, 2009. Integrating web content mining into web usage mining for finding patterns and predicting users behaviors, International Journal of Information Science and Management, Vol, 7, No.1, pp.51-66 [15]T. Hussain, S. Asghar, and S. Fong, “A hierarchical cluster based preprocessing methodology for Web Usage Mining,” in 6th International Conference on Advanced Information Management and Service (IMS), 2010, pp. 472-477. [16]P. J. Rousseeuw, “Silhouettes: A graphical aid to the interpretation and validation of cluster analysis,” Journal of Computational and Applied Mathematics, vol. 20, pp. 53-65, Nov. 1987.