SlideShare a Scribd company logo
Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 01 Issue: 02 December 2012 Page No.20-24
ISSN: 2278-2389
20
Personalizing Web Directories by Constructing User
Community Models
R.Ayesha Kani
Assistant Professor,Department of Computer Applications, Dhanalakshmi College of Engineering
Abstract- Web usage data was analysed and useful knowledge
was extracted, previously by employing machine learning
methods, particularly clustering techniques. This system is
manually constructed; hence it has limited topic coverage. So, I
propose a “Web Community Directory” that collects the data of a
usage mining process, which is collected at the proxy servers of
the central service on the Web. For the construction of the
community models, a data mining algorithm, called “Community
Directory Miner” is extended and used for Content-Based
Personalised Search (CBPS). The Web directory is viewed as a
concept hierarchy which is generated by a content-based
document clustering method. The construction of Web
community directories is seen here as the end result of a usage
mining process the data collected at the proxy servers of the
central service on the Web.
Keywords: machine learning, clustering, web community
directory, web usage mining.
I.INTRODUCTION
1.1 Project Overview:
AT its current state, the web has not achieved its goal of
providing easy access to online information. As its size is
increasing, the abundance of available information on the web
causes the frustrating phenomenon of “information overload” to
Web users. Organization of the web content into thematic
hierarchies is an attempt to alleviate the problem. These
hierarchies are known as web directories and correspond to
listings of topics which are organized and overseen by
humans.As the size of the web is increasing at a galloping pace;
the information overload of its users emerges as one of the web’s
major shortcomings. The concept of web community directories,
as a means of personalizing services on the web, and presents a
novel methodology for the construction of these directories by
usage mining methods. I am applying personalization to web
directories. In this context, the web directory is viewed as a
thematic hierarchy and personalization is realized by
constructing user community models on the basis of usage data. I
introduce a novel methodology that combines the users’
browsing behaviour with thematic information from the Web
directories. Following this methodology, I enhance the clustering
and probabilistic approaches presented in previous work and also
present a new algorithm that combines these two approaches.
The proposed personalization methodology is evaluated both on
a specialized artificial and a general-purpose Web directory,
indicating its potential value to the web user. In particular, I
focus on the construction of usable web directories that model
the interests of groups of users, known as user communities. The
construction of user community models, i.e., usage patterns
representing the browsing preferences of the community
members, with the aid of web usage mining. More specifically, I
present a knowledge discovery framework for constructing
community-specific web directories. Community web directories
exemplify a new objective of web personalization. One of the
challenges is the analysis of large data sets in order to identify
community behavior. Moreover and apart from the heavy traffic
expected at a central node, such as an ISP proxy server, a
peculiarity of the data is that they do not correspond to hits
within the boundaries of a site, but record outgoing traffic to the
whole of the web. This fact leads to increased dimensionality and
semantic incoherence of the data, i.e., the web pages that have
been accessed.
II. RELATED WORK
Web usage mining has been used extensively for web
personalization. A number of personalized services employ
machine learning methods, particularly clustering techniques, to
analyze web usage data and extract useful knowledge for the
recommendation of links to follow within a site, or for the
customization of web sites to the preferences of the users. PLSA
has been used in the context of Collaborative Filtering and Web
Usage Mining. On the other hand, a number of studies exploit
web directories to achieve a form of personalization. Users build
their profiles by specifying a set of categories from the ODP
hierarchy. Automatic profile construction is proposed. The user
profiles linked to categories of the directory are used typically for
personalized web search, while the directory itself is not
personalized.Our work differs from the above cited approaches
in several aspects. First, instead of using the Web directory for
personalization, it personalizes the directory itself. Compared to
existing approaches to directory personalization, it focuses on
aggregate or collaborative user models such as user communities,
rather than content selection for single user.
III. OBJECTIVES
 To provide easy access to online information.
 To decrease information overload.
 To create a knowledge discovery framework for the
construction of community and applying the personalization to
web directories for the purpose of to extract the data.
IV. EXISTING SYSTEM
A number of personalized services employ machine learning
methods, particularly clustering techniques, to analyze web usage
Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 01 Issue: 02 December 2012 Page No.20-24
ISSN: 2278-2389
21
data and extract useful knowledge for the recommendation of
links to follow within a site, or for the customization of Web
sites to the preferences of the users.PLSA was used to construct a
model-based framework that describes user ratings. Latent
factors Ire employed to model unobservable motives, which are
then used to identify similar users and items, in order to predict
subsequent user ratings. In ODP, the user profiles linked to
categories of the directory are used typically for personalized
web search, while the directory itself is not personalized.
An initial approach to automate this process was the montage
system, which was used to create personalized portals, consisting
primarily of links to the web pages that a particular user has
visited, while also organizing the links into thematic categories
according to the ODP directory.
4.1 Existing system disadvantages
• Existing approach was the problem of information overload
is the personalization of the services on the Web.
• Difficult to navigate information to the particular user.
• Manually constructed, hence limited topic coverage.
• Difficult to navigate, due to their size and complexity.
V. PROPOSED SYSTEM
I propose a knowledge discovery framework for building web
directories according to the preferences of user communities.
User community models take the form of thematic hierarchies
and are constructed by employing clustering and probabilistic
learning approaches.
Community web directories are more appropriate than personal
user models for personalization across Web sites, since they
aggregate statistics for many users under a predefined thematic
taxonomy, thus making it possible to handle a large amount of
data, residing in a sparse dimensional space.I address the
problem of “local overload”. I achieve this by combining
thematic with usage information to model the user communities.
I applied our methodology to the ODP directory, as well as to an
artificial web directory, which was generated by clustering web
pages that appear in the access log of a web proxy.
5.1 Methodology
 I describe the pattern discovery methodology that I propose
for the construction of the community web directories.
 This methodology aims at the selection of categories of the
Web directory that satisfy the criteria mentioned in the previous
sections, i.e., community as well as objective informativeness.
 The selected categories are used to construct the sub graph
of the community web directory.
 The input to the pattern discovery algorithms is the user
sessions, which have been mapped to thematic user sessions.
 I note that there is a many-to-one mapping of the web pages
to the leaf categories of the web directory.
5.2Architecture
Proxy server collects all the information’s from world wide web.
These information’s are stored in logs. Logs are activation of
record. These information’s are passed to Data collection and
pre-processing unit for the purpose of cleaning and to get
relevant details only. Then Clustering Algorithm is applied to
those information to form the similar characterized information’s
are grouped together. Page categorization is the output of
clustering algorithm. Pattern Discovery is applied to those page
categories to find the interesting patterns. Based on these
interesting patterns web community directory is created for
human. Finally proxy server processes the request from human
and gives the relevant information to the human.
VI. SYSTEM IMPLEMENTATION
6.1 Construction of Web directory hierarchies.
6.2 Authentication and Authorization.
6.3 Personalization methodology.
6.4 Manipulation of personalized data.
6.1 Construction of Web directory hierarchies
Search the directory categories and pages to find what you need.
An online guide to the world wide web, users who visit and
evaluate Websites and then organize them into subject-based
categories and subcategories. When categorizing websites, user
directory, considers a number of factors, such as commercial
versus non-commercial and regional versus global. All listings
are organized within the categories displayed on the directory’s
front page. You can browse through the directory by looking
through the category list on the left side of the directory
homepage and clicking various category links. A directory is a
subject-tree style catalogue that organizes the Web into major
topics, including Arts, Business and Economy, Computers and
Internet, Education, Entertainment, Government, Health, News,
Recreation, Reference, Regional, Science, Social Science,
Society and Culture. Under each of these topics is a list of
subtopics and under each of those is another list, and another,
and so on, moving from the more general to the more specific.
6.2 Authentication and Authorization
Login is the first module of the project. The login page helps us
to navigate inside the Website. The login page is found as a link
in the home page of personalized Web directory. After logging
in, the user will proceed with search engine. This login consists
of two sub modules namely. The registration page consists of
various fields like username, password, email id and mobile
number. These data are stored in the database and these data can
be retrieved from the database when the login page can be
accessed for the new users to navigate throughout the Website.
The username should be of alphabetic characters. Password can
Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 01 Issue: 02 December 2012 Page No.20-24
ISSN: 2278-2389
22
be of any character of user’s type. E-Mail id must be a valid e-
mail of any prototype Website. A valid ten digit mobile number
is to be provided in the next mobile number field to send the mail
sequence number to corresponding mobile when any AFTN
message is received.User Authentication is the main page for
logging inside the Web directory. This link is provided in the
home page of the Web directory. It consists of the username and
the password field through which the user can login. The system
checks with the database details already registered for the new
user when the submit button is clicked. It thrives the user to the
success page if the details provided are correct. Else the user is
thrived to the error page to make user provide the correct details
already registered. This page also allows the user to transact to
the registration page in case of undefined new user.
6.3 Personalization Methodology
Personalized search has been around for a while for signed in
users who have web history enabled. Personalized search most
important module in this project. After authentication the user
allowed to search their data. Their searches are personalized.
(i.e.) the user’s search is maintained as history. Their web
searches are thus personalized. They can view it later. The user
can also clear their history, if they don’t want. This allows web
directory to fine-tune your search results based on past searches
and on the sites you've clicked on in the past. This is how web
directory tries to optimize searches for you when your search
terms have more than one meaning.
OWL is a language for processing web information. Creating
OWL file is the third module of this project. In this module,
OWL files are dynamically created for each web usage data
insertion. During insertion the exact information of that
corresponding usage data or keywords and that information’s are
stored in OWL file at runtime. It contains the exact description of
the keyword and their relationships. OWL files also contains
Short description, link etc about the keyword. From here only
search engine retrieves the data. This file acts as a database for
retrieving results. The existing categories are stored in history the
categories which are not in bookmark are newly inserted by user
here. Moreover additional information for the existing categories
are also added the information refers here is definition, exact
information, url etc. When you enter a search in the Web
directory engine, only the category you are currently in will be
searched. This can be particularly useful in restricting your
search to a particular topic or domain. For example, a search over
the entire Web for 'lions' might return pages about lions (the
animal), Lions (the football team), Lions (the public service
organization), or any number of other subjects. By searching for
"Lions" within the category "Sports > Football, American >
Professional >", you will see only results related to the Detroit
Lions football team.
6.4 Manipulation of personalized data
The part for signed user is “you will not be able to access or
manage your Search History”. If you wanted web directory to
save my search history, why this required to have personalized
search rammed down throat, with an all or nothing clause. I
could understand if it was the other way around and they said
you can’t have personalized searches without storing your
browsing history, as there’s an obvious data prerequisite. It
would seem to me that web directory is trying to force the
adoption of personalized search telling you that you can’t have
your search history saved without getting personalized results.
EXPERIMENTALSETUP
1.REGISTRATION
2.HOME PAGE
Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 01 Issue: 02 December 2012 Page No.20-24
ISSN: 2278-2389
23
3.ADMIN MAINTAINENCE
4.SEARCHING
Integrated Intelligent Research (IIR) International Journal of Web Technology
Volume: 01 Issue: 02 December 2012 Page No.20-24
ISSN: 2278-2389
24
5. HISTORY MAINTENANCE
6.LOGOUT
VII.CONCLUSION
This project advocates the concept of a web directory, as a web
directory that specializes to the needs and interests of particular
user communities. Furthermore, it presents the complete
methodology for the construction of such directories with the aid
of machine learning methods. User community models take the
form of thematic hierarchies and are constructed by employing
clustering and probabilistic learning approaches. methodology to
the ODP directory, as well as to an artificial web directory,
which was generated by clustering web pages that appear in the
access log of a web proxy. For the discovery of the community
models, I introduced a new criterion that combines the a priori
thematic in formativeness of the web directory categories with
the level of interest observed in the usage data and the Open
Directory Project (ODP) (dmoz.org), allows users to find Web
sites related to the topic they are interested in, by starting with
broad categories and gradually narrowing down, choosing the
category most related to their interests. However, the information
for the topic that a user is seeking might reside very deep inside
the directory.
REFERENCES
[1] J. Sivic and A. Zisserman, “Video Google: A Text Retrieval Approach to
Object Matching in Videos,” Proc. Int’l Conf. Computer Vision (ICCV), vol.
2, pp. 1470-1477, 2003.
[2] L. Chen, M.T. O ¨ zsu, and V. Oria, “Robust and Fast Similarity Search for
Moving Object Trajectories,” Proc. ACM SIGMOD, pp. 491-502, 2005.
[3] R. Weber, H. Schek, and S. Blott, “A Quantitative Analysis and Performance
Study for Similarity Search Methods in High Dimensional Spaces,” Proc.
Int’l Conf. Very Large Data Bases (VLDB), pp. 194-205, 1998.
[4] F. Liu, C. Yu, and W. Meng, “Personalized Web Search by Mapping User
Queries to Categories,” Proc. Int’l Conf. Information and Knowledge
Management (CIKM), 2002.
[5] Open Directory Project, http://www.dmoz.org/, 2009.
[6] J. Sivic and A. Zisserman, “Video Google: A Text Retrieval Approach to
Object Matching in Videos,” Proc. Int’l Conf. Computer Vision (ICCV), vol.
2, pp. 1470-1477, 2003.
[7] Polsani, P.R.: Use and Abuse of Reusable Learning Objects. Journal of
Digital Information, Volume 3 Issue 4, Article No. 164, 2003-02-19
[8] Johnston, P.: It’s all in the context. foundations’ metadata, middleware, e-
learning,2006.
http://efoundations.typepad.com/efoundations/2006/12/its_all_in_the_.html.
Last accessed 23 Feb 2007
[9] Eduserv funded project MURLLO. Available at
http://www.elanguages.ac.uk/murllo/
[10]M. Bilenko and R.J. Mooney, “Adaptive Duplicate Detection Using
Learnable String Similarity Measures,” Proc. ACM SIGKDD, pp. 39-48,
2003.

More Related Content

Similar to Personalizing Web Directories by Constructing User Community Models

Web Page Recommendation Using Web Mining
Web Page Recommendation Using Web MiningWeb Page Recommendation Using Web Mining
Web Page Recommendation Using Web Mining
IJERA Editor
 
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
ijdkp
 
IRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search ResultsIRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search Results
IRJET Journal
 
H0314450
H0314450H0314450
H0314450
iosrjournals
 
Semantically enriched web usage mining for predicting user future movements
Semantically enriched web usage mining for predicting user future movementsSemantically enriched web usage mining for predicting user future movements
Semantically enriched web usage mining for predicting user future movements
IJwest
 
International conference On Computer Science And technology
International conference On Computer Science And technologyInternational conference On Computer Science And technology
International conference On Computer Science And technology
anchalsinghdm
 
F43033234
F43033234F43033234
F43033234
IJERA Editor
 
Automatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage miningAutomatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage mining
IJMIT JOURNAL
 
Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining
IJMIT JOURNAL
 
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
IJSRD
 
2000-08.doc
2000-08.doc2000-08.doc
2000-08.doc
butest
 
2000-08.doc
2000-08.doc2000-08.doc
2000-08.doc
butest
 
Mathematical Programming Approach to Improve Website Structure for Effective ...
Mathematical Programming Approach to Improve Website Structure for Effective ...Mathematical Programming Approach to Improve Website Structure for Effective ...
Mathematical Programming Approach to Improve Website Structure for Effective ...
IOSR Journals
 
F017154347
F017154347F017154347
F017154347
IOSR Journals
 
WEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdfWEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdf
SowmyaJyothi3
 
Recommendation generation by integrating sequential
Recommendation generation by integrating sequentialRecommendation generation by integrating sequential
Recommendation generation by integrating sequential
eSAT Publishing House
 
Recommendation generation by integrating sequential pattern mining and semantics
Recommendation generation by integrating sequential pattern mining and semanticsRecommendation generation by integrating sequential pattern mining and semantics
Recommendation generation by integrating sequential pattern mining and semantics
eSAT Journals
 
Minning www
Minning wwwMinning www
Minning www
Sonali Parab
 
Webmining ppt
Webmining pptWebmining ppt
Webmining ppt
kiransatyawada
 
WEB MINING – A CATALYST FOR E-BUSINESS
WEB MINING – A CATALYST FOR E-BUSINESSWEB MINING – A CATALYST FOR E-BUSINESS
WEB MINING – A CATALYST FOR E-BUSINESS
acijjournal
 

Similar to Personalizing Web Directories by Constructing User Community Models (20)

Web Page Recommendation Using Web Mining
Web Page Recommendation Using Web MiningWeb Page Recommendation Using Web Mining
Web Page Recommendation Using Web Mining
 
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
BIDIRECTIONAL GROWTH BASED MINING AND CYCLIC BEHAVIOUR ANALYSIS OF WEB SEQUEN...
 
IRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search ResultsIRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search Results
 
H0314450
H0314450H0314450
H0314450
 
Semantically enriched web usage mining for predicting user future movements
Semantically enriched web usage mining for predicting user future movementsSemantically enriched web usage mining for predicting user future movements
Semantically enriched web usage mining for predicting user future movements
 
International conference On Computer Science And technology
International conference On Computer Science And technologyInternational conference On Computer Science And technology
International conference On Computer Science And technology
 
F43033234
F43033234F43033234
F43033234
 
Automatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage miningAutomatic recommendation for online users using web usage mining
Automatic recommendation for online users using web usage mining
 
Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining Automatic Recommendation for Online Users Using Web Usage Mining
Automatic Recommendation for Online Users Using Web Usage Mining
 
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
An Enhanced Approach for Detecting User's Behavior Applying Country-Wise Loca...
 
2000-08.doc
2000-08.doc2000-08.doc
2000-08.doc
 
2000-08.doc
2000-08.doc2000-08.doc
2000-08.doc
 
Mathematical Programming Approach to Improve Website Structure for Effective ...
Mathematical Programming Approach to Improve Website Structure for Effective ...Mathematical Programming Approach to Improve Website Structure for Effective ...
Mathematical Programming Approach to Improve Website Structure for Effective ...
 
F017154347
F017154347F017154347
F017154347
 
WEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdfWEBMINING_SOWMYAJYOTHI.pdf
WEBMINING_SOWMYAJYOTHI.pdf
 
Recommendation generation by integrating sequential
Recommendation generation by integrating sequentialRecommendation generation by integrating sequential
Recommendation generation by integrating sequential
 
Recommendation generation by integrating sequential pattern mining and semantics
Recommendation generation by integrating sequential pattern mining and semanticsRecommendation generation by integrating sequential pattern mining and semantics
Recommendation generation by integrating sequential pattern mining and semantics
 
Minning www
Minning wwwMinning www
Minning www
 
Webmining ppt
Webmining pptWebmining ppt
Webmining ppt
 
WEB MINING – A CATALYST FOR E-BUSINESS
WEB MINING – A CATALYST FOR E-BUSINESSWEB MINING – A CATALYST FOR E-BUSINESS
WEB MINING – A CATALYST FOR E-BUSINESS
 

More from ijbuiiir1

Facial Feature Recognition Using Biometrics
Facial Feature Recognition Using BiometricsFacial Feature Recognition Using Biometrics
Facial Feature Recognition Using Biometrics
ijbuiiir1
 
Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...
Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...
Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...
ijbuiiir1
 
Applying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter DataApplying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter Data
ijbuiiir1
 
A Study on the Cyber-Crime and Cyber Criminals: A Global Problem
A Study on the Cyber-Crime and Cyber Criminals: A Global ProblemA Study on the Cyber-Crime and Cyber Criminals: A Global Problem
A Study on the Cyber-Crime and Cyber Criminals: A Global Problem
ijbuiiir1
 
Vehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in MobileVehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in Mobile
ijbuiiir1
 
SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...
SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...
SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...
ijbuiiir1
 
A Survey on Implementation of Discrete Wavelet Transform for Image Denoising
A Survey on Implementation of Discrete Wavelet Transform for Image DenoisingA Survey on Implementation of Discrete Wavelet Transform for Image Denoising
A Survey on Implementation of Discrete Wavelet Transform for Image Denoising
ijbuiiir1
 
A Study on migrated Students and their Well - being using Amartya Sens Functi...
A Study on migrated Students and their Well - being using Amartya Sens Functi...A Study on migrated Students and their Well - being using Amartya Sens Functi...
A Study on migrated Students and their Well - being using Amartya Sens Functi...
ijbuiiir1
 
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...Methodologies on user Behavior Analysis and Future Request Prediction in Web ...
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...
ijbuiiir1
 
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
ijbuiiir1
 
Enhancing Effective Interoperability Between Mobile Apps Using LCIM Model
Enhancing Effective Interoperability Between Mobile Apps Using LCIM ModelEnhancing Effective Interoperability Between Mobile Apps Using LCIM Model
Enhancing Effective Interoperability Between Mobile Apps Using LCIM Model
ijbuiiir1
 
Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...
Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...
Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...
ijbuiiir1
 
Stock Prediction Using Artificial Neural Networks
Stock Prediction Using Artificial Neural NetworksStock Prediction Using Artificial Neural Networks
Stock Prediction Using Artificial Neural Networks
ijbuiiir1
 
Indian Language Text Representation and Categorization Using Supervised Learn...
Indian Language Text Representation and Categorization Using Supervised Learn...Indian Language Text Representation and Categorization Using Supervised Learn...
Indian Language Text Representation and Categorization Using Supervised Learn...
ijbuiiir1
 
Highly Secured Online Voting System (OVS) Over Network
Highly Secured Online Voting System (OVS) Over NetworkHighly Secured Online Voting System (OVS) Over Network
Highly Secured Online Voting System (OVS) Over Network
ijbuiiir1
 
Software Developers Performance relationship with Cognitive Load Using Statis...
Software Developers Performance relationship with Cognitive Load Using Statis...Software Developers Performance relationship with Cognitive Load Using Statis...
Software Developers Performance relationship with Cognitive Load Using Statis...
ijbuiiir1
 
Wireless Health Monitoring System Using ZigBee
Wireless Health Monitoring System Using ZigBeeWireless Health Monitoring System Using ZigBee
Wireless Health Monitoring System Using ZigBee
ijbuiiir1
 
Image Compression Using Discrete Cosine Transform & Discrete Wavelet Transform
Image Compression Using Discrete Cosine Transform & Discrete Wavelet TransformImage Compression Using Discrete Cosine Transform & Discrete Wavelet Transform
Image Compression Using Discrete Cosine Transform & Discrete Wavelet Transform
ijbuiiir1
 
Secured Cloud ERP
Secured Cloud ERPSecured Cloud ERP
Secured Cloud ERP
ijbuiiir1
 
Web Based Secure Soa
Web Based Secure SoaWeb Based Secure Soa
Web Based Secure Soa
ijbuiiir1
 

More from ijbuiiir1 (20)

Facial Feature Recognition Using Biometrics
Facial Feature Recognition Using BiometricsFacial Feature Recognition Using Biometrics
Facial Feature Recognition Using Biometrics
 
Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...
Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...
Partial Image Retrieval Systems in Luminance and Color Invariants : An Empiri...
 
Applying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter DataApplying Clustering Techniques for Efficient Text Mining in Twitter Data
Applying Clustering Techniques for Efficient Text Mining in Twitter Data
 
A Study on the Cyber-Crime and Cyber Criminals: A Global Problem
A Study on the Cyber-Crime and Cyber Criminals: A Global ProblemA Study on the Cyber-Crime and Cyber Criminals: A Global Problem
A Study on the Cyber-Crime and Cyber Criminals: A Global Problem
 
Vehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in MobileVehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in Mobile
 
SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...
SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...
SPOC: A Secure and Private-pressuring Opportunity Computing Framework for Mob...
 
A Survey on Implementation of Discrete Wavelet Transform for Image Denoising
A Survey on Implementation of Discrete Wavelet Transform for Image DenoisingA Survey on Implementation of Discrete Wavelet Transform for Image Denoising
A Survey on Implementation of Discrete Wavelet Transform for Image Denoising
 
A Study on migrated Students and their Well - being using Amartya Sens Functi...
A Study on migrated Students and their Well - being using Amartya Sens Functi...A Study on migrated Students and their Well - being using Amartya Sens Functi...
A Study on migrated Students and their Well - being using Amartya Sens Functi...
 
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...Methodologies on user Behavior Analysis and Future Request Prediction in Web ...
Methodologies on user Behavior Analysis and Future Request Prediction in Web ...
 
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...Innovative Analytic and Holistic Combined Face Recognition and Verification M...
Innovative Analytic and Holistic Combined Face Recognition and Verification M...
 
Enhancing Effective Interoperability Between Mobile Apps Using LCIM Model
Enhancing Effective Interoperability Between Mobile Apps Using LCIM ModelEnhancing Effective Interoperability Between Mobile Apps Using LCIM Model
Enhancing Effective Interoperability Between Mobile Apps Using LCIM Model
 
Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...
Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...
Deployment of Intelligent Transport Systems Based on User Mobility to be Endo...
 
Stock Prediction Using Artificial Neural Networks
Stock Prediction Using Artificial Neural NetworksStock Prediction Using Artificial Neural Networks
Stock Prediction Using Artificial Neural Networks
 
Indian Language Text Representation and Categorization Using Supervised Learn...
Indian Language Text Representation and Categorization Using Supervised Learn...Indian Language Text Representation and Categorization Using Supervised Learn...
Indian Language Text Representation and Categorization Using Supervised Learn...
 
Highly Secured Online Voting System (OVS) Over Network
Highly Secured Online Voting System (OVS) Over NetworkHighly Secured Online Voting System (OVS) Over Network
Highly Secured Online Voting System (OVS) Over Network
 
Software Developers Performance relationship with Cognitive Load Using Statis...
Software Developers Performance relationship with Cognitive Load Using Statis...Software Developers Performance relationship with Cognitive Load Using Statis...
Software Developers Performance relationship with Cognitive Load Using Statis...
 
Wireless Health Monitoring System Using ZigBee
Wireless Health Monitoring System Using ZigBeeWireless Health Monitoring System Using ZigBee
Wireless Health Monitoring System Using ZigBee
 
Image Compression Using Discrete Cosine Transform & Discrete Wavelet Transform
Image Compression Using Discrete Cosine Transform & Discrete Wavelet TransformImage Compression Using Discrete Cosine Transform & Discrete Wavelet Transform
Image Compression Using Discrete Cosine Transform & Discrete Wavelet Transform
 
Secured Cloud ERP
Secured Cloud ERPSecured Cloud ERP
Secured Cloud ERP
 
Web Based Secure Soa
Web Based Secure SoaWeb Based Secure Soa
Web Based Secure Soa
 

Recently uploaded

International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
gerogepatton
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
insn4465
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
Las Vegas Warehouse
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
Divyanshu
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
Madan Karki
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
MDSABBIROJJAMANPAYEL
 
Welding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdfWelding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdf
AjmalKhan50578
 
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
Gino153088
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
171ticu
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
GauravCar
 
Computational Engineering IITH Presentation
Computational Engineering IITH PresentationComputational Engineering IITH Presentation
Computational Engineering IITH Presentation
co23btech11018
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
21UME003TUSHARDEB
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
sachin chaurasia
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
KrishnaveniKrishnara1
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
gowrishankartb2005
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
VICTOR MAESTRE RAMIREZ
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
Yasser Mahgoub
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
shadow0702a
 

Recently uploaded (20)

International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have oneISPM 15 Heat Treated Wood Stamps and why your shipping must have one
ISPM 15 Heat Treated Wood Stamps and why your shipping must have one
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
Welding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdfWelding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdf
 
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
4. Mosca vol I -Fisica-Tipler-5ta-Edicion-Vol-1.pdf
 
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样官方认证美国密歇根州立大学毕业证学位证书原版一模一样
官方认证美国密歇根州立大学毕业证学位证书原版一模一样
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 
Computational Engineering IITH Presentation
Computational Engineering IITH PresentationComputational Engineering IITH Presentation
Computational Engineering IITH Presentation
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 

Personalizing Web Directories by Constructing User Community Models

  • 1. Integrated Intelligent Research (IIR) International Journal of Web Technology Volume: 01 Issue: 02 December 2012 Page No.20-24 ISSN: 2278-2389 20 Personalizing Web Directories by Constructing User Community Models R.Ayesha Kani Assistant Professor,Department of Computer Applications, Dhanalakshmi College of Engineering Abstract- Web usage data was analysed and useful knowledge was extracted, previously by employing machine learning methods, particularly clustering techniques. This system is manually constructed; hence it has limited topic coverage. So, I propose a “Web Community Directory” that collects the data of a usage mining process, which is collected at the proxy servers of the central service on the Web. For the construction of the community models, a data mining algorithm, called “Community Directory Miner” is extended and used for Content-Based Personalised Search (CBPS). The Web directory is viewed as a concept hierarchy which is generated by a content-based document clustering method. The construction of Web community directories is seen here as the end result of a usage mining process the data collected at the proxy servers of the central service on the Web. Keywords: machine learning, clustering, web community directory, web usage mining. I.INTRODUCTION 1.1 Project Overview: AT its current state, the web has not achieved its goal of providing easy access to online information. As its size is increasing, the abundance of available information on the web causes the frustrating phenomenon of “information overload” to Web users. Organization of the web content into thematic hierarchies is an attempt to alleviate the problem. These hierarchies are known as web directories and correspond to listings of topics which are organized and overseen by humans.As the size of the web is increasing at a galloping pace; the information overload of its users emerges as one of the web’s major shortcomings. The concept of web community directories, as a means of personalizing services on the web, and presents a novel methodology for the construction of these directories by usage mining methods. I am applying personalization to web directories. In this context, the web directory is viewed as a thematic hierarchy and personalization is realized by constructing user community models on the basis of usage data. I introduce a novel methodology that combines the users’ browsing behaviour with thematic information from the Web directories. Following this methodology, I enhance the clustering and probabilistic approaches presented in previous work and also present a new algorithm that combines these two approaches. The proposed personalization methodology is evaluated both on a specialized artificial and a general-purpose Web directory, indicating its potential value to the web user. In particular, I focus on the construction of usable web directories that model the interests of groups of users, known as user communities. The construction of user community models, i.e., usage patterns representing the browsing preferences of the community members, with the aid of web usage mining. More specifically, I present a knowledge discovery framework for constructing community-specific web directories. Community web directories exemplify a new objective of web personalization. One of the challenges is the analysis of large data sets in order to identify community behavior. Moreover and apart from the heavy traffic expected at a central node, such as an ISP proxy server, a peculiarity of the data is that they do not correspond to hits within the boundaries of a site, but record outgoing traffic to the whole of the web. This fact leads to increased dimensionality and semantic incoherence of the data, i.e., the web pages that have been accessed. II. RELATED WORK Web usage mining has been used extensively for web personalization. A number of personalized services employ machine learning methods, particularly clustering techniques, to analyze web usage data and extract useful knowledge for the recommendation of links to follow within a site, or for the customization of web sites to the preferences of the users. PLSA has been used in the context of Collaborative Filtering and Web Usage Mining. On the other hand, a number of studies exploit web directories to achieve a form of personalization. Users build their profiles by specifying a set of categories from the ODP hierarchy. Automatic profile construction is proposed. The user profiles linked to categories of the directory are used typically for personalized web search, while the directory itself is not personalized.Our work differs from the above cited approaches in several aspects. First, instead of using the Web directory for personalization, it personalizes the directory itself. Compared to existing approaches to directory personalization, it focuses on aggregate or collaborative user models such as user communities, rather than content selection for single user. III. OBJECTIVES  To provide easy access to online information.  To decrease information overload.  To create a knowledge discovery framework for the construction of community and applying the personalization to web directories for the purpose of to extract the data. IV. EXISTING SYSTEM A number of personalized services employ machine learning methods, particularly clustering techniques, to analyze web usage
  • 2. Integrated Intelligent Research (IIR) International Journal of Web Technology Volume: 01 Issue: 02 December 2012 Page No.20-24 ISSN: 2278-2389 21 data and extract useful knowledge for the recommendation of links to follow within a site, or for the customization of Web sites to the preferences of the users.PLSA was used to construct a model-based framework that describes user ratings. Latent factors Ire employed to model unobservable motives, which are then used to identify similar users and items, in order to predict subsequent user ratings. In ODP, the user profiles linked to categories of the directory are used typically for personalized web search, while the directory itself is not personalized. An initial approach to automate this process was the montage system, which was used to create personalized portals, consisting primarily of links to the web pages that a particular user has visited, while also organizing the links into thematic categories according to the ODP directory. 4.1 Existing system disadvantages • Existing approach was the problem of information overload is the personalization of the services on the Web. • Difficult to navigate information to the particular user. • Manually constructed, hence limited topic coverage. • Difficult to navigate, due to their size and complexity. V. PROPOSED SYSTEM I propose a knowledge discovery framework for building web directories according to the preferences of user communities. User community models take the form of thematic hierarchies and are constructed by employing clustering and probabilistic learning approaches. Community web directories are more appropriate than personal user models for personalization across Web sites, since they aggregate statistics for many users under a predefined thematic taxonomy, thus making it possible to handle a large amount of data, residing in a sparse dimensional space.I address the problem of “local overload”. I achieve this by combining thematic with usage information to model the user communities. I applied our methodology to the ODP directory, as well as to an artificial web directory, which was generated by clustering web pages that appear in the access log of a web proxy. 5.1 Methodology  I describe the pattern discovery methodology that I propose for the construction of the community web directories.  This methodology aims at the selection of categories of the Web directory that satisfy the criteria mentioned in the previous sections, i.e., community as well as objective informativeness.  The selected categories are used to construct the sub graph of the community web directory.  The input to the pattern discovery algorithms is the user sessions, which have been mapped to thematic user sessions.  I note that there is a many-to-one mapping of the web pages to the leaf categories of the web directory. 5.2Architecture Proxy server collects all the information’s from world wide web. These information’s are stored in logs. Logs are activation of record. These information’s are passed to Data collection and pre-processing unit for the purpose of cleaning and to get relevant details only. Then Clustering Algorithm is applied to those information to form the similar characterized information’s are grouped together. Page categorization is the output of clustering algorithm. Pattern Discovery is applied to those page categories to find the interesting patterns. Based on these interesting patterns web community directory is created for human. Finally proxy server processes the request from human and gives the relevant information to the human. VI. SYSTEM IMPLEMENTATION 6.1 Construction of Web directory hierarchies. 6.2 Authentication and Authorization. 6.3 Personalization methodology. 6.4 Manipulation of personalized data. 6.1 Construction of Web directory hierarchies Search the directory categories and pages to find what you need. An online guide to the world wide web, users who visit and evaluate Websites and then organize them into subject-based categories and subcategories. When categorizing websites, user directory, considers a number of factors, such as commercial versus non-commercial and regional versus global. All listings are organized within the categories displayed on the directory’s front page. You can browse through the directory by looking through the category list on the left side of the directory homepage and clicking various category links. A directory is a subject-tree style catalogue that organizes the Web into major topics, including Arts, Business and Economy, Computers and Internet, Education, Entertainment, Government, Health, News, Recreation, Reference, Regional, Science, Social Science, Society and Culture. Under each of these topics is a list of subtopics and under each of those is another list, and another, and so on, moving from the more general to the more specific. 6.2 Authentication and Authorization Login is the first module of the project. The login page helps us to navigate inside the Website. The login page is found as a link in the home page of personalized Web directory. After logging in, the user will proceed with search engine. This login consists of two sub modules namely. The registration page consists of various fields like username, password, email id and mobile number. These data are stored in the database and these data can be retrieved from the database when the login page can be accessed for the new users to navigate throughout the Website. The username should be of alphabetic characters. Password can
  • 3. Integrated Intelligent Research (IIR) International Journal of Web Technology Volume: 01 Issue: 02 December 2012 Page No.20-24 ISSN: 2278-2389 22 be of any character of user’s type. E-Mail id must be a valid e- mail of any prototype Website. A valid ten digit mobile number is to be provided in the next mobile number field to send the mail sequence number to corresponding mobile when any AFTN message is received.User Authentication is the main page for logging inside the Web directory. This link is provided in the home page of the Web directory. It consists of the username and the password field through which the user can login. The system checks with the database details already registered for the new user when the submit button is clicked. It thrives the user to the success page if the details provided are correct. Else the user is thrived to the error page to make user provide the correct details already registered. This page also allows the user to transact to the registration page in case of undefined new user. 6.3 Personalization Methodology Personalized search has been around for a while for signed in users who have web history enabled. Personalized search most important module in this project. After authentication the user allowed to search their data. Their searches are personalized. (i.e.) the user’s search is maintained as history. Their web searches are thus personalized. They can view it later. The user can also clear their history, if they don’t want. This allows web directory to fine-tune your search results based on past searches and on the sites you've clicked on in the past. This is how web directory tries to optimize searches for you when your search terms have more than one meaning. OWL is a language for processing web information. Creating OWL file is the third module of this project. In this module, OWL files are dynamically created for each web usage data insertion. During insertion the exact information of that corresponding usage data or keywords and that information’s are stored in OWL file at runtime. It contains the exact description of the keyword and their relationships. OWL files also contains Short description, link etc about the keyword. From here only search engine retrieves the data. This file acts as a database for retrieving results. The existing categories are stored in history the categories which are not in bookmark are newly inserted by user here. Moreover additional information for the existing categories are also added the information refers here is definition, exact information, url etc. When you enter a search in the Web directory engine, only the category you are currently in will be searched. This can be particularly useful in restricting your search to a particular topic or domain. For example, a search over the entire Web for 'lions' might return pages about lions (the animal), Lions (the football team), Lions (the public service organization), or any number of other subjects. By searching for "Lions" within the category "Sports > Football, American > Professional >", you will see only results related to the Detroit Lions football team. 6.4 Manipulation of personalized data The part for signed user is “you will not be able to access or manage your Search History”. If you wanted web directory to save my search history, why this required to have personalized search rammed down throat, with an all or nothing clause. I could understand if it was the other way around and they said you can’t have personalized searches without storing your browsing history, as there’s an obvious data prerequisite. It would seem to me that web directory is trying to force the adoption of personalized search telling you that you can’t have your search history saved without getting personalized results. EXPERIMENTALSETUP 1.REGISTRATION 2.HOME PAGE
  • 4. Integrated Intelligent Research (IIR) International Journal of Web Technology Volume: 01 Issue: 02 December 2012 Page No.20-24 ISSN: 2278-2389 23 3.ADMIN MAINTAINENCE 4.SEARCHING
  • 5. Integrated Intelligent Research (IIR) International Journal of Web Technology Volume: 01 Issue: 02 December 2012 Page No.20-24 ISSN: 2278-2389 24 5. HISTORY MAINTENANCE 6.LOGOUT VII.CONCLUSION This project advocates the concept of a web directory, as a web directory that specializes to the needs and interests of particular user communities. Furthermore, it presents the complete methodology for the construction of such directories with the aid of machine learning methods. User community models take the form of thematic hierarchies and are constructed by employing clustering and probabilistic learning approaches. methodology to the ODP directory, as well as to an artificial web directory, which was generated by clustering web pages that appear in the access log of a web proxy. For the discovery of the community models, I introduced a new criterion that combines the a priori thematic in formativeness of the web directory categories with the level of interest observed in the usage data and the Open Directory Project (ODP) (dmoz.org), allows users to find Web sites related to the topic they are interested in, by starting with broad categories and gradually narrowing down, choosing the category most related to their interests. However, the information for the topic that a user is seeking might reside very deep inside the directory. REFERENCES [1] J. Sivic and A. Zisserman, “Video Google: A Text Retrieval Approach to Object Matching in Videos,” Proc. Int’l Conf. Computer Vision (ICCV), vol. 2, pp. 1470-1477, 2003. [2] L. Chen, M.T. O ¨ zsu, and V. Oria, “Robust and Fast Similarity Search for Moving Object Trajectories,” Proc. ACM SIGMOD, pp. 491-502, 2005. [3] R. Weber, H. Schek, and S. Blott, “A Quantitative Analysis and Performance Study for Similarity Search Methods in High Dimensional Spaces,” Proc. Int’l Conf. Very Large Data Bases (VLDB), pp. 194-205, 1998. [4] F. Liu, C. Yu, and W. Meng, “Personalized Web Search by Mapping User Queries to Categories,” Proc. Int’l Conf. Information and Knowledge Management (CIKM), 2002. [5] Open Directory Project, http://www.dmoz.org/, 2009. [6] J. Sivic and A. Zisserman, “Video Google: A Text Retrieval Approach to Object Matching in Videos,” Proc. Int’l Conf. Computer Vision (ICCV), vol. 2, pp. 1470-1477, 2003. [7] Polsani, P.R.: Use and Abuse of Reusable Learning Objects. Journal of Digital Information, Volume 3 Issue 4, Article No. 164, 2003-02-19 [8] Johnston, P.: It’s all in the context. foundations’ metadata, middleware, e- learning,2006. http://efoundations.typepad.com/efoundations/2006/12/its_all_in_the_.html. Last accessed 23 Feb 2007 [9] Eduserv funded project MURLLO. Available at http://www.elanguages.ac.uk/murllo/ [10]M. Bilenko and R.J. Mooney, “Adaptive Duplicate Detection Using Learnable String Similarity Measures,” Proc. ACM SIGKDD, pp. 39-48, 2003.