2. A SURVEY ON WEB USAGE MINING
TECHNIQUES
Mr. Abdul Rahaman Wahab Sait
Lecturer, Shaqra University,
Kingdom of Saudi Arabia
(Research Scholar, Alagappa University, India)
rahamaan@gmail.com
Dr. Meyappan
Professor,
Dept. of Computer Science and Eng.,
Alagappa University, India
meyslotus@yahoo.com
3. INTRODUCTION
• The Internet has become a popular medium for
business people.
• Today millions of domains exist on the
Internet, which clearly shows how much the
Internet has grown.
• The web is also a database: a distributed
database in which much of the data is hidden,
that is, stored in the deep web.
5. INTRODUCTION
• The figure shows a graphical representation
of the top 16 countries by number of domains
in the world.
• The data was taken from the site
www.webhosting.info on 25.2.2013.
• E-business is the electronic version of business
and runs on the Internet: buying and selling
activities take place online.
6. INTRODUCTION
• Competition is a constant in the world; it
pushes a person to find new ideas to present
himself as unique.
• In this scenario, e-business needs the help of
data mining techniques to promote buying
and selling activities.
• Web mining techniques are very useful for
determining customer behavior from the
huge pool of web data.
7. INTRODUCTION
• Web usage mining (WUM)
WUM is used to discover interesting patterns that
can be applied to many real-world problems,
such as improving the presentation of a website,
better understanding users' behavior,
and product recommendation.
8. INTRODUCTION
• THREE MAJOR CATEGORIES OF PATTERN
EXTRACTION:
ASSOCIATION RULES
CLUSTERING
CLASSIFICATION
• In this survey we present WUM
techniques implemented using clustering and
classification.
9. Role of WUM
• WUM helps to determine the frequent access
behavior of users, so that the links they need
can be identified and the overall performance
of future accesses improved.
• It provides detailed feedback on user behavior
providing the website designer information on
which to base redesign decisions.
• It can be used for statistical research about
the site's users / customers.
10. Role of WUM
• It can be used for performance evaluation of
the company / organization.
• It is usually an automated process in which
web servers collect user access patterns and
report them in server access logs.
11. An Illustration of Pattern
Extraction
• The figure illustrates the method of WUM.
• Data collected from web logs is
preprocessed and, if necessary,
transformed into a suitable form before being
given as input to the pattern extraction methods.
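The preprocessing step can be sketched as grouping raw log records into user sessions before pattern extraction. This is a minimal sketch, not the method of any surveyed paper: the 30-minute inactivity timeout, the `sessionize` function and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a session ends after 30 minutes of inactivity. This is a
# common heuristic, not something the slides specify.
SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(records, timeout=SESSION_TIMEOUT):
    """Group (user, timestamp, url) records into sessions.

    Returns a list of sessions; each session is a list of URLs in visit order.
    """
    by_user = defaultdict(list)
    for user, ts, url in records:
        by_user[user].append((ts, url))

    sessions = []
    for user in sorted(by_user):
        current = []
        last_ts = None
        for ts, url in sorted(by_user[user]):
            # A long gap since the previous request starts a new session.
            if last_ts is not None and ts - last_ts > timeout:
                sessions.append(current)
                current = []
            current.append(url)
            last_ts = ts
        if current:
            sessions.append(current)
    return sessions

# Hypothetical records: (user id, timestamp, requested URL)
records = [
    ("10.0.0.1", datetime(2013, 2, 25, 10, 0), "/index.html"),
    ("10.0.0.1", datetime(2013, 2, 25, 10, 5), "/products.html"),
    ("10.0.0.1", datetime(2013, 2, 25, 11, 30), "/index.html"),
    ("10.0.0.2", datetime(2013, 2, 25, 10, 1), "/contact.html"),
]
sessions = sessionize(records)
```

Each resulting session can then be turned into a feature vector or an item set, depending on which pattern extraction method is applied.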
12. Web Logs
• Web logs are plain text files that record the activity of the
user. The figure shows the types of web log.
• Web server log – The most common place to store usage data
and the primary source of data for web usage mining; entries
are collected when users access web pages.
13. Web Logs
• Web proxy server log – Primarily used for security
purposes. A web proxy acts as an intermediate
level of caching between client browsers and
web servers.
• Client log – When a user browses a website,
some data about the activity is stored
on the client for future use by the
client or by the web server.
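As an illustration of what a web server log entry contains, the sketch below parses one line in the Common Log Format. The slides do not say which log format the surveyed work uses, so the format, the sample line and the `parse_clf_line` helper are assumptions.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [time] "method url protocol" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A size of "-" means no bytes were returned.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# Hypothetical log line, made up for this example
sample = ('192.168.1.10 - - [25/Feb/2013:10:15:32 +0530] '
          '"GET /index.html HTTP/1.1" 200 2326')
entry = parse_clf_line(sample)
```

Lines that do not match the pattern return `None`, so malformed entries can be counted or dropped during preprocessing.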
14. SURVEY - WUM Algorithms
• Clustering is the process of grouping
similar observations into smaller
groups within the larger population.
• Clustering is an active research subject in
fields such as statistics, pattern
recognition and machine learning.
15. SURVEY - WUM Algorithms
• Clustering is unsupervised learning of a hidden data concept.
• The following table shows research based on
clustering algorithms.
Research based on Clustering Approach
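To make the idea of unsupervised grouping concrete, here is a minimal k-means sketch. K-means is chosen only because it is short; it is not presented as the algorithm of any surveyed paper. The session features (pages viewed, minutes on site) and the choice of starting centroids are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical session features: (pages viewed, minutes on site)
session_vectors = [(2, 1.0), (3, 1.5), (2, 2.0), (12, 15.0), (14, 18.0), (13, 16.5)]
labels, centroids = kmeans(session_vectors, [session_vectors[0], session_vectors[3]])
```

No labels are supplied: the two groups (short visits and long visits) emerge only from the distances between sessions.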
16. SURVEY - WUM Algorithms
• Classification is one of the most important data
mining functions; it assigns items in a
collection to target categories or classes.
• Classification is a supervised learning method,
so processing and evaluation depend
on the target.
• The following table shows research based
on classification algorithms.
17. SURVEY - WUM Algorithms
Research based on Classification Approach
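In contrast to clustering, classification learns from examples that already carry a target class. The sketch below uses k-nearest neighbours, picked only for brevity and not attributed to any surveyed paper. The features, the labels "casual" and "buyer", and the `knn_classify` helper are hypothetical.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Return the majority label among the k training examples nearest to query."""
    neighbours = sorted(training, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labelled sessions: (pages viewed, minutes on site) -> class
training = [
    ((2, 1.0), "casual"),
    ((3, 1.5), "casual"),
    ((2, 2.0), "casual"),
    ((12, 15.0), "buyer"),
    ((14, 18.0), "buyer"),
    ((13, 16.5), "buyer"),
]
prediction = knn_classify(training, (11, 14.0))
```

Because the target classes are known in advance, the result can be evaluated by comparing predictions against held-out labelled sessions.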
18. CONCLUSION
• Clustering and classification techniques are
the foundation for building intelligent
websites.
• The main goal of this paper is to study the
existing techniques implemented with
clustering and classification algorithms.
• Further study will lead to a new
automated web usage mining technique for
web page classification and pattern
extraction.