International Journal of Information Sciences and Techniques (IJIST) Vol.2, No.1, January 2012
DOI: 10.5121/ijist.2012.2104
A PERSONALIZED WEB PAGE CONTENT FILTERING
MODEL BASED ON SEGMENTATION
K.S. Kuppusamy¹ and G. Aghila²
¹Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India
kskuppu@gmail.com
²Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India
aghilaa@yahoo.com
ABSTRACT
In view of the massive content explosion on the World Wide Web from diverse sources, content
filtering tools have become essential. Filtering the contents of web pages is especially
significant when the pages are accessed by minors. Traditional web page blocking systems follow a
Boolean methodology of either displaying the full page or blocking it completely. With the increased
dynamism of web pages, it has become common for different portions of a web page to hold
different types of content at different times. This paper proposes a model that blocks
content at a fine-grained level: instead of blocking the page completely, it blocks
only those segments which hold the content to be blocked. The advantages of this method over
traditional methods are the fine-grained level of blocking and the automatic identification of the portions
of the page to be blocked. The experiments conducted on the proposed model indicate an accuracy of
about 88% in filtering out the segments.
KEYWORDS
Content Filtering, Segmentation, Web Page Blocking
1. INTRODUCTION
The World Wide Web (WWW) has become the biggest repository of information known to
mankind. Another aspect which makes the World Wide Web more powerful is the ease with
which this repository can be accessed. With the prolific improvements in the functionality of web
search engines, any information available on the World Wide Web is “a single click” away.
Though this can be considered an advantage, it creates certain vulnerabilities as well. With the
floodgates of information on the World Wide Web open, the exposure of all users to diverse
information has created the need for some sort of filtering mechanism in this information
consumption process.
This paper proposes a technique which would facilitate this filtering mechanism. The objectives
of this paper are as listed below:
• Proposing a model for web page content filtering based on segmentation.
• Incorporation of personalization in the proposed model to enhance the web content
filtering process.
The remainder of this paper is organized as follows: In Section 2, some of the related works
carried out in this domain are explored. Section 3 deals with the proposed model’s mathematical
representation and algorithms. Section 4 is about prototype implementation and experiments.
Section 5 focuses on the conclusions and future directions for this research work.
2. RELATED WORKS
This section highlights the related works that have been carried out in this domain. The
proposed model incorporates the following two major fields of study:
• The Web Content Filtering
• Web Page Segmentation
2.1 The Web Content Filtering
Content filtering systems for web pages are an active research topic, primarily for the following
reasons: they protect users (especially minors) from unwanted content, and they save network
resources from unwanted usage, such as playing network games on an office network.
There exist many approaches to content filtering systems. Some of them are listed below:
• Rating Systems
• Black Listing / White Listing
• Keyword blocking
In Rating Systems, users are asked to rate a web site for its content. This rating is then used as a
tool for filtering [1]. Black listing / white listing maintains a manually prepared set of URLs
for filtering. The problem with this approach is scalability. Many tools are available that
perform content filtering using the methods specified above [2], [3], [4].
The text classification based approach is explored in [5], [6]. The approach chosen
to facilitate filtering in this paper is a variation of the keyword based blocking method.
2.2 Web Page Segmentation
Web page segmentation is an active research topic in the information retrieval domain in which a
wide range of experiments are conducted. Web page segmentation is the process of dividing a
web page into smaller units based on various criteria. The following are the four basic types of web
page segmentation methods:
• Fixed length page segmentation
• DOM based page segmentation
• Vision based page segmentation
• Combined / Hybrid method
A comparative study of these four types of segmentation is presented in [7]. Each of
the above mentioned segmentation methods has been studied in detail in the literature. Fixed length
page segmentation is simple to implement, but the major problem
with this approach is that it doesn’t consider any semantics of the page while segmenting. In
DOM based page segmentation, the Document Object Model tree of the HTML tags is used
while segmenting. An arbitrary passages based approach is given in [8]. Vision based page
segmentation (VIPS) parallels the way humans view a page. VIPS [9] is a popular
segmentation algorithm which segments a page based on various visual features.
Apart from the above mentioned segmentation methods, a few novel approaches have
evolved during the last few years. An image processing based segmentation approach is
illustrated in [10]. A segmentation process based on the text density of the contents is explained in
[11]. A graph theory based approach to segmentation is presented in [12]. Repetition-based web
page segmentation by detecting tag patterns for small-screen devices is explored in [13]. An
approach to web page segmentation for specific domains is detailed in [14]. A tree
clustering based segmentation approach is provided in [15].
3. THE MODEL
This section elaborates on the mathematical model of the proposed system. The corresponding
algorithm to carry out the task specified in the model is also explored in this section. The block
diagram of the proposed model is shown in Figure 1. It contains the following components:
Page Segmentor: This component is responsible for segmenting the contents of the page into
logically relevant units.
Personalizer: This component handles the personalization of filtering. The Personalizer holds the
profile-bag which contains user preferences.
Segment Filter: Segment filter is another component in the model which handles individual
segments and decides whether this segment should be incorporated in the filtered page or not.
3.1 Mathematical Model
In the proposed model, each page that the user requests needs to be segmented for filtration. Let us
denote the source page by Φ. The source page Φ has to be segmented into various logically
coherent parts.
The source page Φ is mapped to a DOM (Document Object Model) tree. The individual
nodes of the DOM tree are processed by parsing the tree. The “block level” and “non-block level”
nodes are identified and used as the building blocks of the individual segments.
The approach followed in this paper also incorporates densitometry concepts in the segment
building process. Densitometry considers the density of the text present in a block unit when
performing the segmentation. As a result of the above mentioned process, the source page
Φ is segmented into various units as shown in (1).
$$\Phi = \{\mu_1, \mu_2, \mu_3, \ldots, \mu_n\} \tag{1}$$
The segmentation process shown in (1) is performed by the “Page Segmentor” component in the
proposed model.
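The segmentation step can be illustrated with a minimal sketch, assuming that segments are cut at block-level tag boundaries and that text density is measured as word tokens per wrapped line. The tag list `BLOCK_TAGS`, the 80-character wrap width and the names `BlockSegmentor`, `text_density` and `segment_page` are illustrative assumptions and are not taken from the prototype.

```python
from html.parser import HTMLParser
import textwrap

# Assumed set of block-level tags; the paper does not list the tags it uses.
BLOCK_TAGS = {"div", "p", "section", "article", "table", "ul", "ol", "li",
              "h1", "h2", "h3", "h4", "h5", "h6", "header", "footer", "nav"}


class BlockSegmentor(HTMLParser):
    """Splits a page into text blocks at block-level tag boundaries."""

    def __init__(self):
        super().__init__()
        self.blocks = []     # finished text blocks, in document order
        self._buffer = []    # text fragments of the block being built

    def _flush(self):
        # Normalise whitespace and keep the block only if it holds text.
        text = " ".join(" ".join(self._buffer).split())
        if text:
            self.blocks.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        # Text inside non-block (inline) tags stays in the current block.
        self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def text_density(text, width=80):
    """Word tokens per wrapped line: a simplified density measure."""
    lines = textwrap.wrap(text, width) or [""]
    return len(text.split()) / len(lines)


def segment_page(html):
    """Return the page as a list of segments with their text density."""
    parser = BlockSegmentor()
    parser.feed(html)
    parser.close()
    return [{"text": block, "density": text_density(block)}
            for block in parser.blocks]


segments = segment_page(
    "<div><h1>News</h1><p>Some text here</p></div><div>Games</div>")
```

A fuller segmentor would typically also skip script and style content and fuse adjacent blocks of similar density; both are left out here to keep the sketch short.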
After the completion of the segmentation process, each of the segments needs to be processed
individually. This is performed by the “Segment Filter” component. Each segment $\mu_i$ is
represented as a triple containing the three components listed below:
• Text
• Link
• Image
Figure 1. Block Diagram of the Model
The segment triple containing text, link and image is represented as shown in (2).
$$\mu_i = \{\Psi, \Lambda, \Theta\} \tag{2}$$
The triple $\mu_i$ can be expanded as shown in (3).
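$$\mu_i = \begin{Bmatrix} [\eta_1, \eta_2, \ldots, \eta_p] & \forall \eta_i \in \Psi \\ [\kappa_1, \kappa_2, \ldots, \kappa_q] & \forall \kappa_i \in \Lambda \\ [\lambda_1, \lambda_2, \ldots, \lambda_r] & \forall \lambda_i \in \Theta \end{Bmatrix} \tag{3}$$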
In (3), $[\eta_1, \eta_2, \ldots, \eta_p]$ represents the text elements present in the segment under consideration;
$[\kappa_1, \kappa_2, \ldots, \kappa_q]$ represents the individual links present in the segment; and
$[\lambda_1, \lambda_2, \ldots, \lambda_r]$ represents the image elements present in the segment.
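One way to picture the triple in (2) and (3) is as a small record holding three lists. The sketch below assumes that a segment arrives as an HTML fragment, that links are taken from `href` attributes and that images are represented by their “alt” text; the names `Segment`, `TripleParser` and `parse_segment` are hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Segment:
    """One segment mu_i as a triple of text, link and image elements."""
    text: list = field(default_factory=list)     # Psi: text tokens
    links: list = field(default_factory=list)    # Lambda: href values
    images: list = field(default_factory=list)   # Theta: image alt texts


class TripleParser(HTMLParser):
    """Collects the three components of a segment while parsing it."""

    def __init__(self):
        super().__init__()
        self.segment = Segment()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.segment.links.append(attributes["href"])
        elif tag == "img":
            # Assumption: an image is described by its alt text only.
            self.segment.images.append(attributes.get("alt") or "")

    def handle_data(self, data):
        self.segment.text.extend(data.split())


def parse_segment(html_fragment):
    """Parse one HTML fragment into a Segment triple."""
    parser = TripleParser()
    parser.feed(html_fragment)
    parser.close()
    return parser.segment


example = parse_segment(
    '<p>Latest <a href="http://example.com/games">Games</a> '
    '<img src="g.png" alt="Games banner"></p>')
```

Here `example.text` holds the tokens, `example.links` the link targets and `example.images` the alt texts, mirroring the three rows of (3).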
The individual segments need to be processed for each of these three components to decide
whether a segment can be displayed or needs to be blocked. In order to perform
this, the segment filter component includes three sub-components: a) Text Filter, b) Link Filter and c)
Image Filter. The focus of this research work is on the effect of segmentation and personalization.
The actual filtration process can be either simple keyword based or it can be customized
according to the requirements of the implementation.
The proposed model incorporates a personalization aspect. The user can configure the filter
according to his/her requirements. The user preferences are represented using a “Profile Bag”. The
profile bag involves two different tracks: the “Like Track” and the “Un-Like Track”.
The block diagram of the profile bag is shown in Figure 2. The figure consists of three horizontal
layers. The top layer denotes the overall profile bag. The middle layer represents the “Like Track”
and the “Un-Like Track”. The bottom layer in Figure 2 denotes the keywords which form the
“Like Track” and the “Un-Like Track”.
The profile bag is represented in the model as Γ . The two different tracks of Γ are represented as
shown in (4).
$$\Gamma = \begin{Bmatrix} \omega \\ \sigma \end{Bmatrix} \tag{4}$$
In (4), ω represents the “Like Track” and σ represents the “Un-Like Track” of the profile bag.
Both ω and σ contain keywords that represent the user preferences. The keywords in ω add a
positive booster and the keywords in σ add a negative booster.
The filtration process can be represented as shown in (5). As a result of (5), the Text Weight, Link
Weight and Image Weight are each calculated as a sum in which every term common to ω and the
component’s elements contributes a weight of “+1” and every term common to σ and the
component’s elements contributes a weight of “-1”.
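The weighting rule described above can be sketched as follows, assuming single-word keywords and case-insensitive matching (neither is specified in the paper). The names `ProfileBag` and `component_weight` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileBag:
    """Profile bag Gamma with its two keyword tracks."""
    like: set = field(default_factory=set)      # omega: "Like Track"
    unlike: set = field(default_factory=set)    # sigma: "Un-Like Track"


def component_weight(elements, bag):
    """+1 per term shared with the Like Track, -1 per term shared with the Un-Like Track."""
    # Assumption: elements are split on whitespace and compared in lower case.
    terms = {term.lower() for element in elements for term in element.split()}
    likes = len(terms & {keyword.lower() for keyword in bag.like})
    unlikes = len(terms & {keyword.lower() for keyword in bag.unlike})
    return likes - unlikes


bag = ProfileBag(like={"science", "news"}, unlike={"games"})
weight = component_weight(["Science news today", "Games"], bag)
```

In this example two terms match the Like Track and one matches the Un-Like Track, so the component weight is 2 - 1 = 1. The same function can serve as the text, link or image weight, depending on which list of elements is passed in.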
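$$\{\Psi, \Lambda, \Theta\} = \begin{Bmatrix} \dfrac{[\eta_1, \eta_2, \ldots, \eta_p]}{\Gamma} & \forall \eta_i \in \Psi \\ \dfrac{[\kappa_1, \kappa_2, \ldots, \kappa_q]}{\Gamma} & \forall \kappa_i \in \Lambda \\ \dfrac{[\lambda_1, \lambda_2, \ldots, \lambda_r]}{\Gamma} & \forall \lambda_i \in \Theta \end{Bmatrix} \tag{5}$$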
If the sum of the weights of these three components is at least the threshold level δ, the segment is
displayed; otherwise it is blocked.
$$\Phi' = \forall \mu_i \in \Phi : \begin{cases} \Phi' \cup \mu_i & \text{if } (\Psi + \Lambda + \Theta) \geq \delta \\ \Phi' \cup \mu_z & \text{else} \end{cases} \tag{6}$$
In (6), $\Phi'$ represents the filtered page, which incorporates the segments whose weight is at or
above the threshold limit δ. When the weight is less than the threshold, a dummy
segment $\mu_z$ holding the message “segment blocked” is added to the page instead.
Figure 2. The User Profile - Bag
The dummy segment which replaces the filtered segment can be custom defined. The
proposed model has another feature called “link hiding”. With link hiding, if the content
to be blocked has a hyperlink, the hyperlink alone can be removed instead of the content,
which has a similar impact to removing the content.
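Link hiding could be sketched as below. This is only an illustration under stated assumptions: it uses regular expressions rather than a full HTML parser, it treats an anchor as blocked when a keyword occurs in its attributes or visible text, and the function name `hide_links` is hypothetical.

```python
import re

# Matches a complete anchor element; nested anchors are not handled.
ANCHOR = re.compile(r"<a\b([^>]*)>(.*?)</a>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def hide_links(html_fragment, blocked_keywords):
    """Unwrap anchors that mention a blocked keyword, keeping their visible content."""
    blocked = [keyword.lower() for keyword in blocked_keywords]

    def unwrap(match):
        attributes, inner = match.group(1), match.group(2)
        visible = TAG.sub(" ", inner).lower()
        haystack = attributes.lower() + " " + visible
        if any(keyword in haystack for keyword in blocked):
            return inner           # drop <a ...> and </a>, keep the content
        return match.group(0)      # leave other links untouched

    return ANCHOR.sub(unwrap, html_fragment)


hidden = hide_links(
    '<p><a href="/play">Games</a> and <a href="/news">News</a></p>',
    ["games"])
```

With the keyword “games” blocked, the first anchor is reduced to its plain text while the second link is left as it is.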
3.2 The Algorithm
The algorithmic representation of the steps involved in the above explained model is explored in
this section.
In the algorithm, TF, LF and IF refer to the text filter, link filter and image filter respectively.
4. EXPERIMENTS AND RESULT ANALYSIS
The proposed model has been implemented as a prototype for experimentation. The prototype
is implemented with a software stack comprising Linux, Apache, MySQL and PHP. JavaScript
is used for client side scripting. With respect to the hardware, a Core i3 processor system
running at 3 GHz with 8 GB of RAM is used. The internet connection used in the experimental
setup is a 128 Mbps leased line.
Screenshots of the prototype implementation are shown in Figure 3 and Figure 4. The
screenshot in Figure 3 shows the original source page.
Algorithm SegmentFilter
Input: Source Web Page $\Phi$, profile bag $\Gamma$
Output: Filtered Page $\Phi'$
Begin
    Segment the source page using the page segmentor: $\Phi = \{\mu_1, \mu_2, \mu_3, \ldots, \mu_n\}$
    Initialize $\Phi'$ to NULL
    For each segment $\mu_i$
    Begin
        Parse the segment $\mu_i$ into components $\{\Psi, \Lambda, \Theta\}$
        Calculate Text Weight $\Psi$ = TF ($\Psi / \Gamma$)
        Calculate Link Weight $\Lambda$ = LF ($\Lambda / \Gamma$)
        Calculate Image Weight $\Theta$ = IF ($\Theta / \Gamma$)
        If $(\Psi + \Lambda + \Theta) \geq \delta$ then
            $\Phi' = \Phi' \cup \mu_i$
        Else
            $\Phi' = \Phi' \cup \mu_z$
    End
    Return ($\Phi'$)
End
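The SegmentFilter loop can be rendered in Python as a rough sketch. It assumes that segments are already parsed into (text, links, images) triples and that the three filters are interchangeable callables, in line with the statement that the actual filtration can be customized. The names `segment_filter` and `keyword_score`, the dictionary form of the profile bag and the threshold value 0 are illustrative assumptions.

```python
def keyword_score(elements, profile_bag):
    """Simple keyword filter: +1 per liked term, -1 per un-liked term."""
    words = {word.lower() for element in elements for word in element.split()}
    return len(words & profile_bag["like"]) - len(words & profile_bag["unlike"])


def segment_filter(segments, profile_bag, tf, lf, img_f, delta,
                   dummy="segment blocked"):
    """Return the filtered page as a list of segments.

    segments: iterable of (text, links, images) triples.
    tf, lf, img_f: callables scoring one component against the profile bag.
    delta: threshold that the summed weight must reach for display.
    """
    filtered = []                          # the filtered page, initially empty
    for segment in segments:
        text, links, images = segment
        weight = (tf(text, profile_bag)
                  + lf(links, profile_bag)
                  + img_f(images, profile_bag))
        if weight >= delta:
            filtered.append(segment)       # keep the segment as it is
        else:
            filtered.append(dummy)         # substitute the dummy segment
    return filtered


# Assumed profile bag with lower-case keywords and an assumed threshold of 0.
bag = {"like": {"science"}, "unlike": {"games"}}
page = [
    (["Science", "news"], ["http://example.com/science"], []),
    (["Play", "games", "now"], ["http://example.com/games"], ["games banner"]),
]
result = segment_filter(page, bag, keyword_score, keyword_score,
                        keyword_score, delta=0)
```

In this run the first segment scores 1 and is kept, while the second scores -2 and is replaced by the dummy segment. With a threshold of 0, a segment that matches no keyword at all is still displayed; a stricter threshold would block such neutral segments as well.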
Figure 3. The Source Page
The page segments are filtered out based on the filtering preferences set up. The resultant page is
as shown in Figure 4.
Figure 4. The Page after filtering the unwanted segments.
In Figure 4, it can be noted that the segments containing the terms “Entertainment Software” and
“Games” are filtered out as per the filtering preferences set. Table 1 lists the results of the
experiments conducted on the proposed content filtering model. In Table 1, MSC
indicates the mean segment count, MFSC stands for the mean filtered segment count, MFP is the mean
number of false positives and MFN is the mean number of false negatives.
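Table 1. Experimental Results of the proposed model

| Session ID | MSC | MFSC | MFP | MFN | Accuracy (%) |
|---|---|---|---|---|---|
| 1 | 27.52 | 5.2 | 1.2 | 1.5 | 90.189 |
| 2 | 30.25 | 3.5 | 0.8 | 1.2 | 93.388 |
| 3 | 43.53 | 4.5 | 1.3 | 1.3 | 94.027 |
| 4 | 20.67 | 2.7 | 0.7 | 0.2 | 95.646 |
| 5 | 18.45 | 1.5 | 1.6 | 0.4 | 89.16 |
| 6 | 14.66 | 2.3 | 3.5 | 0.5 | 72.715 |
| 7 | 16.78 | 4.3 | 3.1 | 1.1 | 74.97 |
| 8 | 17.67 | 1.3 | 1.2 | 1.5 | 84.72 |
| 9 | 14.85 | 1.8 | 0.5 | 0.8 | 91.246 |
| 10 | 25.52 | 2.6 | 0.9 | 0.9 | 92.947 |
| 11 | 12.45 | 5.2 | 0.9 | 1.3 | 82.329 |
| 12 | 22.15 | 5.1 | 0.6 | 1.4 | 90.971 |
| 13 | 23.45 | 3.9 | 1.1 | 0.6 | 92.751 |
| 14 | 25.45 | 4.2 | 1.2 | 0.8 | 92.141 |
| 15 | 12.45 | 4.3 | 1.8 | 1.1 | 76.707 |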
The chart in Figure 5 compares the average number of segments filtered out in a session with the false
positives and the false negatives. It can be observed that the mean of MFSC across the sessions is
3.49, whereas the means of MFP and MFN are 1.36 and 0.97 respectively.
Figure 5. Comparison of MFSC, MFP and MFN
The chart in Figure 6 compares the mean segment count with the accuracy. It can be observed
that the mean accuracy of filtering across the sessions is 87.59%, which supports the effectiveness of the
proposed content filtering model.
Figure 6. Comparison of MSC and Accuracy
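The formula behind the accuracy column is not stated in the text. The values in Table 1 are, however, consistent with counting both false positives and false negatives as errors relative to the mean segment count, and the sketch below assumes that formula. The function name `accuracy` and the tuple layout of `ROWS` are illustrative.

```python
# (session, MSC, MFSC, MFP, MFN, reported accuracy %) copied from Table 1
ROWS = [
    (1, 27.52, 5.2, 1.2, 1.5, 90.189),
    (2, 30.25, 3.5, 0.8, 1.2, 93.388),
    (3, 43.53, 4.5, 1.3, 1.3, 94.027),
    (4, 20.67, 2.7, 0.7, 0.2, 95.646),
    (5, 18.45, 1.5, 1.6, 0.4, 89.16),
    (6, 14.66, 2.3, 3.5, 0.5, 72.715),
    (7, 16.78, 4.3, 3.1, 1.1, 74.97),
    (8, 17.67, 1.3, 1.2, 1.5, 84.72),
    (9, 14.85, 1.8, 0.5, 0.8, 91.246),
    (10, 25.52, 2.6, 0.9, 0.9, 92.947),
    (11, 12.45, 5.2, 0.9, 1.3, 82.329),
    (12, 22.15, 5.1, 0.6, 1.4, 90.971),
    (13, 23.45, 3.9, 1.1, 0.6, 92.751),
    (14, 25.45, 4.2, 1.2, 0.8, 92.141),
    (15, 12.45, 4.3, 1.8, 1.1, 76.707),
]


def accuracy(msc, mfp, mfn):
    """Assumed formula: share of segments that were not misclassified, in percent."""
    return (msc - mfp - mfn) / msc * 100


# Largest gap between the assumed formula and the reported column.
max_gap = max(abs(accuracy(msc, mfp, mfn) - reported)
              for _, msc, _, mfp, mfn, reported in ROWS)

# Mean accuracy across the 15 sessions.
mean_accuracy = sum(accuracy(msc, mfp, mfn)
                    for _, msc, _, mfp, mfn, _ in ROWS) / len(ROWS)
```

For session 1, for example, this gives (27.52 - 1.2 - 1.5) / 27.52 × 100 ≈ 90.19, which matches the reported value.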
5. CONCLUSIONS AND FUTURE DIRECTIONS
The proposed model for page filtering using segmentation and personalization offers the
following advantages:
• In cases where the content to be blocked is present only in a portion of the page, the
proposed model blocks only that portion instead of the entire page, so the rest of the
page remains available to the user.
• Incorporation of personalization in the blocking process provides a tailor-made
content filtering system based on the user’s needs.
The future directions for this research work are listed below:
• In the proposed model, image filtering uses the “alt” text provided with
the image. In future implementations, image analysis modules can be
incorporated to make the image filtering more effective.
• Incorporation of the capability to handle languages other than English would make
the system more effective for non-English web pages.
REFERENCES
[1] Paul Resnick and Jim Miller. PICS: Internet access controls without censorship. Communications of
the ACM, 39(10):87-93, 1996.
[2] Net Nanny, Available: http://www.netnanny.com
[3] Cyber Patrol, Available: http://www.cyberpatrol.com/
[4] Cyber Sitter, Available: http://www.cybersitter.com
[5] Du, R.; Safavi-Naini, R.; Susilo, W.; Web filtering using text classification, The 11th IEEE
International Conference on Networks (ICON 2003), 2003, pages 325-330.
[6] Weiming Hu, Ou Wu, Zhouyao Chen, Zhouyu Fu, Maybank, S., "Recognition of Pornographic
Web Pages by Classifying Texts and Images", Pattern Analysis and Machine Intelligence, IEEE
Transactions on, On page(s): 1019 - 1034, Volume: 29 Issue: 6, June 2007.
[7] Deng Cai, Shipeng Yu, Ji-Rong Wen, and Wei-Ying Ma. Block-based web search. In SIGIR ’04:
Proceedings of the 27th annual international ACM SIGIR conference on Research and development
in information retrieval, pages 456–463, New York, NY, USA, 2004. ACM
[8] Kaszkiel, M. and Zobel, J., Effective Ranking with Arbitrary Passages, Journal of the American
Society for Information Science, Vol. 52, No. 4, 2001, pp. 344-364.
[9] D. Cai, S. Yu, J. Wen, and W.-Y. Ma, VIPS: A vision-based page segmentation algorithm, Tech. Rep.
MSR-TR-2003-79, 2003.
[10] Cao, Jiuxin , Mao, Bo and Luo, Junzhou, 'A segmentation method for web page analysis using
shrinking and dividing', International Journal of Parallel, Emergent and Distributed Systems, 25: 2, 93
— 104, 2010.
[11] Kohlschütter, C. and Nejdl, W. A densitometric approach to web page segmentation. In Proceeding of
the 17th ACM Conference on information and Knowledge Management (Napa Valley, California,
USA, October 26 - 30, 2008). CIKM '08. ACM, New York, NY, 1173-1182, 2008.
[12] Deepayan Chakrabarti , Ravi Kumar , Kunal Punera, A graph-theoretic approach to webpage
segmentation, Proceeding of the 17th international conference on World Wide Web, April 21-25,
Beijing, China, 2008.
[13] Jinbeom Kang, Jaeyoung Yang, Joongmin Choi, “Repetition-based Web Page Segmentation by
Detecting Tag Patterns for Small-Screen Devices”, IEEE Transactions on Consumer Electronics,
IEEE, vol. 56, no. 2, pp.980-986, 2010.
[14] Madaan Aastha, Chu Wanming, Bhalla Subhash, "VisHue: Web Page Segmentation for an
Improved Query Interface for MedlinePlus Medical Encyclopedia", Databases in Networked
Information Systems, Springer Lecture Notes in Computer Science, 2011.
[15] Xinyue Liu, Hongfei Lin, Ye Tian, Segmenting Webpage with Gomory-Hu Tree Based Clustering,
Journal of Software, Vol 6, No 12 (2011), 2421-2425.
Authors
K.S. Kuppusamy is an Assistant Professor at the Department of Computer Science, School
of Engineering and Technology, Pondicherry University, Pondicherry, India. He
obtained his Masters degree in Computer Science and Information Technology from
Madurai Kamaraj University. He is currently pursuing his Ph.D. in the field of
Intelligent Information Management. His research interests include Web Search
Engines and the Semantic Web. He has made 8 international publications.
G. Aghila is a Professor at the Department of Computer Science, School of Engineering
and Technology, Pondicherry University, Pondicherry, India. She has a total of 22
years of teaching experience. She received her M.E. (Computer Science and
Engineering) and Ph.D. from Anna University, Chennai, India. She has published more
than 55 research papers on web crawlers and ontology based information retrieval. She is
currently a supervisor guiding 8 Ph.D. scholars. She is a recipient of the Schrneiger award.
She is an expert in ontology development. Her areas of interest include Intelligent
Information Management, artificial intelligence, text mining and semantic web
technologies.

More Related Content

What's hot

Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational DatabasesSemantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databasesijsrd.com
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Scienceresearchinventy
 
A Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity StructureA Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity Structureiosrjce
 
A novel method for generating an elearning ontology
A novel method for generating an elearning ontologyA novel method for generating an elearning ontology
A novel method for generating an elearning ontologyIJDKP
 
Multi Label Spatial Semi Supervised Classification using Spatial Associative ...
Multi Label Spatial Semi Supervised Classification using Spatial Associative ...Multi Label Spatial Semi Supervised Classification using Spatial Associative ...
Multi Label Spatial Semi Supervised Classification using Spatial Associative ...cscpconf
 
IRJET- Concept Extraction from Ambiguous Text Document using K-Means
IRJET- Concept Extraction from Ambiguous Text Document using K-MeansIRJET- Concept Extraction from Ambiguous Text Document using K-Means
IRJET- Concept Extraction from Ambiguous Text Document using K-MeansIRJET Journal
 
Community profiling for social networks
Community profiling for social networksCommunity profiling for social networks
Community profiling for social networkseSAT Publishing House
 
Multi-Topic Multi-Document Summarizer
Multi-Topic Multi-Document SummarizerMulti-Topic Multi-Document Summarizer
Multi-Topic Multi-Document Summarizerijcsit
 
In tech application-of_data_mining_technology_on_e_learning_material_recommen...
In tech application-of_data_mining_technology_on_e_learning_material_recommen...In tech application-of_data_mining_technology_on_e_learning_material_recommen...
In tech application-of_data_mining_technology_on_e_learning_material_recommen...Enhmandah Hemeelee
 
Development of pattern knowledge discovery framework using
Development of pattern knowledge discovery framework usingDevelopment of pattern knowledge discovery framework using
Development of pattern knowledge discovery framework usingIAEME Publication
 
Performance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information RetrievalPerformance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information Retrievalidescitation
 
zmet-mapping the mind of the mobile consumer across borders
zmet-mapping the mind of the mobile consumer across borderszmet-mapping the mind of the mobile consumer across borders
zmet-mapping the mind of the mobile consumer across bordersLisaIndah1
 
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...Editor IJCATR
 
On The Automated Classification of Web Pages Using Artificial Neural Network
On The Automated Classification of Web Pages Using Artificial  Neural NetworkOn The Automated Classification of Web Pages Using Artificial  Neural Network
On The Automated Classification of Web Pages Using Artificial Neural NetworkIOSR Journals
 
Semantic web personalization
Semantic web personalizationSemantic web personalization
Semantic web personalizationAlexander Decker
 
Novel Database-Centric Framework for Incremental Information Extraction
Novel Database-Centric Framework for Incremental Information ExtractionNovel Database-Centric Framework for Incremental Information Extraction
Novel Database-Centric Framework for Incremental Information Extractionijsrd.com
 

What's hot (19)

Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational DatabasesSemantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
 
Research Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and ScienceResearch Inventy : International Journal of Engineering and Science
Research Inventy : International Journal of Engineering and Science
 
A Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity StructureA Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity Structure
 
A novel method for generating an elearning ontology
A novel method for generating an elearning ontologyA novel method for generating an elearning ontology
A novel method for generating an elearning ontology
 
Multi Label Spatial Semi Supervised Classification using Spatial Associative ...
Multi Label Spatial Semi Supervised Classification using Spatial Associative ...Multi Label Spatial Semi Supervised Classification using Spatial Associative ...
Multi Label Spatial Semi Supervised Classification using Spatial Associative ...
 
IRJET- Concept Extraction from Ambiguous Text Document using K-Means
IRJET- Concept Extraction from Ambiguous Text Document using K-MeansIRJET- Concept Extraction from Ambiguous Text Document using K-Means
IRJET- Concept Extraction from Ambiguous Text Document using K-Means
 
Community profiling for social networks
Community profiling for social networksCommunity profiling for social networks
Community profiling for social networks
 
50120140503012
5012014050301250120140503012
50120140503012
 
Multi-Topic Multi-Document Summarizer
Multi-Topic Multi-Document SummarizerMulti-Topic Multi-Document Summarizer
Multi-Topic Multi-Document Summarizer
 
In tech application-of_data_mining_technology_on_e_learning_material_recommen...
In tech application-of_data_mining_technology_on_e_learning_material_recommen...In tech application-of_data_mining_technology_on_e_learning_material_recommen...
In tech application-of_data_mining_technology_on_e_learning_material_recommen...
 
Development of pattern knowledge discovery framework using
Development of pattern knowledge discovery framework usingDevelopment of pattern knowledge discovery framework using
Development of pattern knowledge discovery framework using
 
AN IMPROVED TECHNIQUE FOR DOCUMENT CLUSTERING
AN IMPROVED TECHNIQUE FOR DOCUMENT CLUSTERINGAN IMPROVED TECHNIQUE FOR DOCUMENT CLUSTERING
AN IMPROVED TECHNIQUE FOR DOCUMENT CLUSTERING
 
Performance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information RetrievalPerformance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information Retrieval
 
zmet-mapping the mind of the mobile consumer across borders
zmet-mapping the mind of the mobile consumer across borderszmet-mapping the mind of the mobile consumer across borders
zmet-mapping the mind of the mobile consumer across borders
 
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...
 
On The Automated Classification of Web Pages Using Artificial Neural Network
On The Automated Classification of Web Pages Using Artificial  Neural NetworkOn The Automated Classification of Web Pages Using Artificial  Neural Network
On The Automated Classification of Web Pages Using Artificial Neural Network
 
320 324
320 324320 324
320 324
 
Semantic web personalization
Semantic web personalizationSemantic web personalization
Semantic web personalization
 
Novel Database-Centric Framework for Incremental Information Extraction
Novel Database-Centric Framework for Incremental Information ExtractionNovel Database-Centric Framework for Incremental Information Extraction
Novel Database-Centric Framework for Incremental Information Extraction
 

Similar to A PERSONALIZED WEB PAGE CONTENT FILTERING MODEL BASED ON SEGMENTATION

A Multimodal Approach to Incremental User Profile Building
A Multimodal Approach to Incremental User Profile Building A Multimodal Approach to Incremental User Profile Building
A Multimodal Approach to Incremental User Profile Building dannyijwest
 
Web Content Mining Based on Dom Intersection and Visual Features Concept
Web Content Mining Based on Dom Intersection and Visual Features ConceptWeb Content Mining Based on Dom Intersection and Visual Features Concept
Web Content Mining Based on Dom Intersection and Visual Features Conceptijceronline
 
Query Sensitive Comparative Summarization of Search Results Using Concept Bas...
Query Sensitive Comparative Summarization of Search Results Using Concept Bas...Query Sensitive Comparative Summarization of Search Results Using Concept Bas...
Query Sensitive Comparative Summarization of Search Results Using Concept Bas...CSEIJJournal
 
QUERY SENSITIVE COMPARATIVE SUMMARIZATION OF SEARCH RESULTS USING CONCEPT BAS...
A PERSONALIZED WEB PAGE CONTENT FILTERING MODEL BASED ON SEGMENTATION

International Journal of Information Sciences and Techniques (IJIST) Vol.2, No.1, January 2012
DOI: 10.5121/ijist.2012.2104

K.S.Kuppusamy (1) and G.Aghila (2)
(1) Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India
kskuppu@gmail.com
(2) Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India
aghilaa@yahoo.com

ABSTRACT
In view of the massive content explosion on the World Wide Web through diverse sources, content filtering tools have become a necessity. Filtering the contents of web pages holds greater significance when the pages are accessed by minors. Traditional web page blocking systems follow a Boolean methodology of either displaying the full page or blocking it completely. With the increased dynamism of web pages, it has become common for different portions of a web page to hold different types of content at different times. This paper proposes a model to block content at a fine-grained level: instead of blocking the page completely, it is more efficient to block only those segments which hold the content to be blocked. The advantages of this method over the traditional methods are the fine-grained level of blocking and the automatic identification of the portions of the page to be blocked. The experiments conducted on the proposed model indicate 88% accuracy in filtering out the segments.

KEYWORDS
Content Filtering, Segmentation, Web Page Blocking

1. INTRODUCTION
The World Wide Web (WWW) has become the biggest repository of information known to mankind. Another aspect which makes the World Wide Web powerful is the ease with which this repository can be accessed.
With the prolific improvements in the functionality of Web search engines, any information available on the World Wide Web is “a single click” away. Though this can be considered an advantage, it poses certain vulnerabilities as well. With the floodgates of information on the World Wide Web open, the exposure of all users to diverse information has created the need for some sort of filtering mechanism in the information consumption process. This paper proposes a technique which facilitates this filtering mechanism. The objectives of this paper are as listed below:
• Proposing a model for web page content filtering based on segmentation.
• Incorporating personalization in the proposed model to enhance the web content filtering process.

The remainder of this paper is organized as follows: Section 2 explores some of the related works carried out in this domain. Section 3 deals with the proposed model’s mathematical representation and algorithms. Section 4 covers the prototype implementation and experiments. Section 5 focuses on the conclusions and future directions for this research work.

2. RELATED WORKS
This section highlights the related works that have been carried out in this domain. The proposed model incorporates the following two major fields of study:
• Web Content Filtering
• Web Page Segmentation

2.1 The Web Content Filtering
Content filtering for web pages is an active research topic, primarily for the following reasons: it protects users (especially minors) from unwanted content, and it saves network resources from unwanted usage, such as playing network games on an office network. There exist many approaches to content filtering systems. Some of them are listed below:
• Rating Systems
• Black Listing / White Listing
• Keyword blocking

In rating systems, users are asked to rate a web site for its content, and this rating is used as a tool for filtering [1]. Black listing / white listing maintains a manually prepared set of URLs for filtering. The problem with this approach is scalability. Many tools are available to perform content filtering using the methods specified above [2], [3], [4]. The text classification based approach is explored in [5], [6]. The approach chosen to facilitate filtering in this paper is a variation of the keyword based blocking method.
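The list-based and keyword-based approaches described above can be illustrated with a minimal Python sketch. It is not taken from the paper or from the tools cited in [2], [3], [4]; the function names, the sample host, and the whole-word matching rule are assumptions made purely for illustration.

```python
import re
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Return the lower-cased host name of a URL ('' if there is none)."""
    return (urlparse(url).hostname or "").lower()


def allowed_by_lists(url: str, blacklist: set, whitelist: set) -> bool:
    """Black listing / white listing: decide from manually prepared host sets.

    Assumption for this sketch: the whitelist takes precedence over the blacklist.
    """
    host = host_of(url)
    if host in whitelist:
        return True
    return host not in blacklist


def has_blocked_keyword(text: str, blocked: set) -> bool:
    """Keyword blocking: True if any blocked keyword occurs as a whole word."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return any(word.lower() in tokens for word in blocked)
```

The sketch also shows why list-based filtering scales poorly: every host has to be entered by hand, whereas the keyword check applies to any text that is passed to it.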
2.2 Web Page Segmentation
Web page segmentation is an active research topic in the information retrieval domain, in which a wide range of experiments have been conducted. Web page segmentation is the process of dividing a web page into smaller units based on various criteria. The following are the four basic types of web page segmentation methods:
• Fixed length page segmentation
• DOM based page segmentation
• Vision based page segmentation
• Combined / Hybrid method

A comparative study of these four types of segmentation is presented in [7]. Each of the above segmentation methods has been studied in detail in the literature. Fixed length page segmentation is simple to implement, but its major problem is that it does not consider any semantics of the page while segmenting. In DOM based page segmentation, the Document Object Model (DOM) tree built from the HTML tags is used for segmenting. An arbitrary passages based approach is given in [8]. Vision based page segmentation is in line with the way humans view a page. VIPS [9] is a popular segmentation algorithm which segments a page based on various visual features.

Apart from the above segmentation methods, a few novel approaches have evolved during the last few years. An image processing based segmentation approach is illustrated in [10]. A segmentation process based on the text density of the contents is explained in [11]. A graph theory based approach to segmentation is presented in [12]. Repetition-based web page segmentation by detecting tag patterns for small-screen devices is explored in [13]. One of the approaches to web page segmentation for specific domains is detailed in [14]. A tree clustering based segmentation approach is provided in [15].

3. THE MODEL
This section elaborates the mathematical model of the proposed system. The corresponding algorithm to carry out the task specified in the model is also explored in this section. The block diagram of the proposed model is shown in Figure 1. It contains the following components:

Page Segmentor: This component is responsible for segmenting the contents of the page into logically relevant units.

Personalizer: This component handles the personalization of filtering. The Personalizer holds the profile-bag, which contains the user preferences.

Segment Filter: This component handles the individual segments and decides whether each segment should be incorporated in the filtered page or not.
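To make the role of the Page Segmentor more concrete, the following is a minimal sketch of a DOM-style segmentation using only Python's standard `html.parser`. It assumes that a fixed set of block-level tags (`BLOCK_TAGS`, a hypothetical choice) delimits segments and it ignores nesting and text density, so it is a simplification and not the authors' segmentor. The class and function names are illustrative.

```python
from html.parser import HTMLParser

# Assumption: these tags start a new segment; the set is illustrative only.
BLOCK_TAGS = {"div", "p", "section", "article", "table", "ul", "ol", "h1", "h2", "h3"}


class SimpleSegmentor(HTMLParser):
    """Collects text, link targets and image alt texts per block-level tag."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._current = None

    def _new_segment(self):
        self._current = {"text": [], "links": [], "images": []}
        self.segments.append(self._current)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in BLOCK_TAGS or self._current is None:
            self._new_segment()
        if tag == "a" and attrs.get("href"):
            self._current["links"].append(attrs["href"])
        elif tag == "img":
            # Images are represented by their alt text (empty if missing).
            self._current["images"].append(attrs.get("alt") or "")

    def handle_data(self, data):
        text = data.strip()
        if text:
            if self._current is None:
                self._new_segment()
            self._current["text"].append(text)


def segment_page(html):
    """Return the non-empty segments of an HTML string, in document order."""
    parser = SimpleSegmentor()
    parser.feed(html)
    parser.close()
    return [s for s in parser.segments if s["text"] or s["links"] or s["images"]]


html = (
    "<div><p>Sports news today</p></div>"
    '<div>Play <a href="/games">online games</a>'
    '<img src="g.png" alt="games banner"></div>'
)
segments = segment_page(html)
print(len(segments))
```

Because the sketch is flat, text that follows a nested block is attributed to the most recently opened block; a real segmentor would walk the DOM tree instead.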
3.1 Mathematical Model
In the proposed model, each page that the user requests needs to be segmented for filtration. Let us denote the source page by $\Phi$. The source page $\Phi$ has to be segmented into various logically coherent parts. The source page $\Phi$ is mapped to a DOM (Document Object Model) tree, and the individual nodes of the DOM tree are processed by parsing the tree. The “block level” and “non-block level” nodes are identified and used as the building blocks of the individual segments. The approach followed in this paper also incorporates densitometry concepts in the segment building process: densitometry considers the density of the text present in a block unit when performing the segmentation. As a result of the above process, the source page $\Phi$ is segmented into various units as shown in (1).

$$\Phi = \{\mu_1, \mu_2, \mu_3, \dots, \mu_n\} \tag{1}$$

The segmentation process shown in (1) is performed by the “Page Segmentor” component of the proposed model. After the completion of the segmentation process, each of the segments needs to be processed individually. This is performed by the “Segment Filter” component. Each segment $\mu_i$ is represented as a triple containing the three components listed below:
• Text
• Link
• Image

Figure 1. Block Diagram of the Model

The segment triple containing text, link and image is represented as shown in (2).

$$\mu_i = \{\Psi, \Lambda, \Theta\} \tag{2}$$

The triple $\mu_i$ can be expanded as shown in (3).
$$\mu_i = \begin{Bmatrix} [\eta_1, \eta_2, \dots, \eta_p] & \forall \eta_i \in \Psi \\ [\kappa_1, \kappa_2, \dots, \kappa_q] & \forall \kappa_i \in \Lambda \\ [\lambda_1, \lambda_2, \dots, \lambda_r] & \forall \lambda_i \in \Theta \end{Bmatrix} \tag{3}$$

In (3), $[\eta_1, \eta_2, \dots, \eta_p]$ represents the text elements present in the segment under consideration, $[\kappa_1, \kappa_2, \dots, \kappa_q]$ represents the individual links present in the segment and $[\lambda_1, \lambda_2, \dots, \lambda_r]$ represents the image elements present in the segment. Each segment needs to be processed for each of these three components to decide whether it can be allowed for display or needs to be blocked. In order to perform this, the segment filter component includes three sub-components: a) Text Filter, b) Link Filter and c) Image Filter. The focus of this research work is on the effect of segmentation and personalization. The actual filtration process can be either simple keyword based or customized according to the requirements of the implementation.

The proposed model incorporates a personalization aspect: the user can configure the filter according to his/her requirements. The user preferences are represented using the “Profile Bag”. The profile bag involves two different tracks, the “Like Track” and the “Un-Like Track”. The block diagram of the profile bag is shown in Figure 2. The figure consists of three horizontal layers. The top layer denotes the overall profile bag. The middle layer represents the “Like Track” and the “Un-Like Track”. The bottom layer denotes the keywords which form the “Like Track” and the “Un-Like Track”. The profile bag is represented in the model as $\Gamma$. The two different tracks of $\Gamma$ are represented as shown in (4).

$$\Gamma = \frac{\omega}{\sigma} \tag{4}$$

In (4), $\omega$ represents the “Like Track” and $\sigma$ represents the “Un-Like Track” of the profile bag. Both $\omega$ and $\sigma$ contain keywords that represent the user preferences. The keywords in $\omega$ add a positive booster and the keywords in $\sigma$ add a negative booster.
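The profile bag and its two tracks can be illustrated with a small data structure. The sketch below is one possible reading under stated assumptions: keywords are single lowercase words, each distinct like-track term found in a component counts +1 and each distinct un-like-track term counts -1. The names `ProfileBag` and `component_weight` and the sample keywords are hypothetical and do not come from the paper.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ProfileBag:
    """Profile bag with a like track (omega) and an un-like track (sigma)."""
    like: set = field(default_factory=set)
    unlike: set = field(default_factory=set)


def component_weight(elements, bag):
    """Weight of one component (text, link or image elements) against the bag.

    Assumption: the weight is the number of distinct terms shared with the
    like track minus the number of distinct terms shared with the un-like track.
    """
    terms = set()
    for element in elements:
        terms.update(re.findall(r"[a-z0-9]+", element.lower()))
    return len(terms & bag.like) - len(terms & bag.unlike)


# Illustrative preferences (hypothetical).
bag = ProfileBag(like={"sports", "news"}, unlike={"games"})
print(component_weight(["Sports news today"], bag))
print(component_weight(["Play online games", "games banner"], bag))
```

Counting distinct terms rather than every occurrence is a design choice made here for simplicity; an occurrence-based count would weight repeated keywords more heavily.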
The filtration process can be represented as shown in (5). As a result of (5), the Text Weight, Link Weight and Image Weight are each calculated as a sum in which every term common to $\omega$ and the component's elements contributes a weight of “+1” and every term common to $\sigma$ and the component's elements contributes a weight of “-1”.
$$\{\Psi, \Lambda, \Theta\} = \begin{Bmatrix} \dfrac{[\eta_1, \eta_2, \dots, \eta_p] \;\; \forall \eta_i \in \Psi}{\Gamma} \\[2ex] \dfrac{[\kappa_1, \kappa_2, \dots, \kappa_q] \;\; \forall \kappa_i \in \Lambda}{\Gamma} \\[2ex] \dfrac{[\lambda_1, \lambda_2, \dots, \lambda_r] \;\; \forall \lambda_i \in \Theta}{\Gamma} \end{Bmatrix} \tag{5}$$

If the sum of the weights of these three components is at least a threshold level $\delta$, the segment is displayed; otherwise it is blocked.

$$\forall \mu_i \in \Phi : \quad \Phi' = \begin{cases} \Phi' \cup \mu_i & \text{if } (\Psi + \Lambda + \Theta) \ge \delta \\ \Phi' \cup \mu_z & \text{else} \end{cases} \tag{6}$$

In (6), $\Phi'$ denotes the filtered page (as distinct from the source page $\Phi$), which incorporates the segments whose weight reaches the threshold $\delta$. When the weight of a segment is less than the threshold, a dummy segment $\mu_z$ holding the message “segment blocked” is added to the page instead.

Figure 2. The User Profile-Bag

The dummy segment which replaces the filtered segment can be custom defined. The proposed model has another feature called “link hiding”. In the case of link hiding, if the content to be blocked carries a hyperlink, then instead of removing the content the hyperlink alone can be removed, which has a similar impact to removing the content.

3.2 The Algorithm
The algorithmic representation of the steps involved in the model explained above is explored in this section.
Algorithm SegmentFilter
Input: Source web page $\Phi$, profile bag $\Gamma$
Output: Filtered page $\Phi'$
Begin
    Segment the source page using the page segmentor: $\Phi = \{\mu_1, \mu_2, \mu_3, \dots, \mu_n\}$
    Initialize $\Phi'$ to NULL
    For each segment $\mu_i$
    Begin
        Parse the segment $\mu_i$ into components $\{\Psi, \Lambda, \Theta\}$
        Calculate Text Weight $\Psi$ = TF($\Psi / \Gamma$)
        Calculate Link Weight $\Lambda$ = LF($\Lambda / \Gamma$)
        Calculate Image Weight $\Theta$ = IF($\Theta / \Gamma$)
        If $(\Psi + \Lambda + \Theta) \ge \delta$ then
            $\Phi' = \Phi' \cup \mu_i$
        Else
            $\Phi' = \Phi' \cup \mu_z$
    End
    Return ($\Phi'$)
End

In the algorithm, TF, LF and IF refer to the text filter, link filter and image filter respectively.

4. EXPERIMENTS AND RESULT ANALYSIS
The proposed model has been implemented as a prototype for experimentation. The prototype implementation uses a software stack of Linux, Apache, MySQL and PHP. JavaScript is used for client side scripting. With respect to the hardware, a system with a 3 GHz Core i3 processor and 8 GB of RAM is used. The internet connection used in the experimental setup is a 128 Mbps leased line.

The screenshots of the prototype implementation are shown in Figure 3 and Figure 4. The screenshot shown in Figure 3 is of the original source page.
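As an illustration of the SegmentFilter algorithm of Section 3.2, the loop can be sketched in a few lines of Python. This is not the authors' PHP prototype: the segment representation (a dictionary with `text`, `links` and `images` lists), the use of one shared keyword-matching rule for the text, link and image filters, and the default threshold of 0 are all assumptions made for the example, since the paper does not fix a value for the threshold. All function names are hypothetical.

```python
import re


def _terms(elements):
    """Lowercased word set of a list of strings (assumed tokenization)."""
    found = set()
    for element in elements:
        found.update(re.findall(r"[a-z0-9]+", element.lower()))
    return found


def component_weight(elements, like, unlike):
    """+1 per distinct like-track term, -1 per distinct un-like-track term."""
    terms = _terms(elements)
    return len(terms & like) - len(terms & unlike)


def blocked_segment():
    """Dummy segment that replaces a filtered-out segment."""
    return {"text": ["segment blocked"], "links": [], "images": []}


def segment_filter(segments, like, unlike, threshold=0):
    """Keep a segment if its summed weight reaches the threshold, else replace it.

    Assumption: the same keyword rule serves as text, link and image filter,
    with images represented by their alt text.
    """
    filtered_page = []
    for segment in segments:
        total = (
            component_weight(segment["text"], like, unlike)
            + component_weight(segment["links"], like, unlike)
            + component_weight(segment["images"], like, unlike)
        )
        filtered_page.append(segment if total >= threshold else blocked_segment())
    return filtered_page


# Illustrative segments and preferences (hypothetical).
segments = [
    {"text": ["Sports news today"], "links": [], "images": []},
    {"text": ["Entertainment Software and Games"],
     "links": ["/games/latest"], "images": ["games banner"]},
    {"text": ["Weather update"], "links": [], "images": []},
]
result = segment_filter(segments, like={"sports"}, unlike={"games", "entertainment"})
for segment in result:
    print(segment["text"])
```

With a threshold of 0, a segment that matches no keyword at all is displayed; raising the threshold above 0 would instead display only segments that match the like track.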
Figure 3. The Source Page

The page segments are filtered out based on the filtering preferences that have been set up. The resultant page is shown in Figure 4.

Figure 4. The Page after filtering the unwanted segments.

In Figure 4, it can be noted that the segments containing the terms “Entertainment Software” and “Games” are filtered out as per the filtering preferences set. Table 1 lists the results of the experiments conducted on the proposed content filtering model. In Table 1, MSC indicates the mean segment count, MFSC stands for the mean filtered segment count, MFP is the mean number of false positives and MFN is the mean number of false negatives.
Table 1. Experimental Results of the proposed model

| Session ID | MSC | MFSC | MFP | MFN | Accuracy (%) |
|---|---|---|---|---|---|
| 1 | 27.52 | 5.2 | 1.2 | 1.5 | 90.189 |
| 2 | 30.25 | 3.5 | 0.8 | 1.2 | 93.388 |
| 3 | 43.53 | 4.5 | 1.3 | 1.3 | 94.027 |
| 4 | 20.67 | 2.7 | 0.7 | 0.2 | 95.646 |
| 5 | 18.45 | 1.5 | 1.6 | 0.4 | 89.16 |
| 6 | 14.66 | 2.3 | 3.5 | 0.5 | 72.715 |
| 7 | 16.78 | 4.3 | 3.1 | 1.1 | 74.97 |
| 8 | 17.67 | 1.3 | 1.2 | 1.5 | 84.72 |
| 9 | 14.85 | 1.8 | 0.5 | 0.8 | 91.246 |
| 10 | 25.52 | 2.6 | 0.9 | 0.9 | 92.947 |
| 11 | 12.45 | 5.2 | 0.9 | 1.3 | 82.329 |
| 12 | 22.15 | 5.1 | 0.6 | 1.4 | 90.971 |
| 13 | 23.45 | 3.9 | 1.1 | 0.6 | 92.751 |
| 14 | 25.45 | 4.2 | 1.2 | 0.8 | 92.141 |
| 15 | 12.45 | 4.3 | 1.8 | 1.1 | 76.707 |

The chart in Figure 5 compares the average number of segments filtered out in a session with the false positives and the false negatives. It can be observed that the mean of MFSC across the sessions is 3.49, whereas the means of MFP and MFN are 1.3 and 0.9 respectively.

Figure 5. Comparison of MFSC, MFP and MFN

The chart in Figure 6 compares the Mean Segment Count with the accuracy. It can be observed that the mean accuracy of filtering across the sessions is 87.59%, which supports the effectiveness of the proposed content filtering model.
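The paper does not state how the accuracy column is computed. The tabulated values are, however, consistent with accuracy = (MSC - MFP - MFN) / MSC × 100, and the short check below recomputes the column under that inferred formula. The formula is an inference from Table 1, not a definition given by the authors; the data rows are copied from the table.

```python
# (session, MSC, MFP, MFN, reported accuracy %) copied from Table 1.
TABLE_1 = [
    (1, 27.52, 1.2, 1.5, 90.189), (2, 30.25, 0.8, 1.2, 93.388),
    (3, 43.53, 1.3, 1.3, 94.027), (4, 20.67, 0.7, 0.2, 95.646),
    (5, 18.45, 1.6, 0.4, 89.16), (6, 14.66, 3.5, 0.5, 72.715),
    (7, 16.78, 3.1, 1.1, 74.97), (8, 17.67, 1.2, 1.5, 84.72),
    (9, 14.85, 0.5, 0.8, 91.246), (10, 25.52, 0.9, 0.9, 92.947),
    (11, 12.45, 0.9, 1.3, 82.329), (12, 22.15, 0.6, 1.4, 90.971),
    (13, 23.45, 1.1, 0.6, 92.751), (14, 25.45, 1.2, 0.8, 92.141),
    (15, 12.45, 1.8, 1.1, 76.707),
]


def accuracy(msc, mfp, mfn):
    """Inferred formula: share of segments that were neither FP nor FN, in percent."""
    return (msc - mfp - mfn) / msc * 100


# Largest gap between the recomputed and the reported accuracy values.
max_gap = max(abs(accuracy(msc, mfp, mfn) - reported)
              for _, msc, mfp, mfn, reported in TABLE_1)
mean_accuracy = sum(accuracy(msc, mfp, mfn)
                    for _, msc, mfp, mfn, _ in TABLE_1) / len(TABLE_1)

print(f"largest deviation from the reported column: {max_gap:.4f}")
print(f"mean accuracy: {mean_accuracy:.2f}")
```

Under this reading, accuracy is the share of a session's segments that were neither wrongly blocked nor wrongly displayed, and the recomputed mean agrees with the 87.59 reported in the text.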
Figure 6. Comparison of MSC and Accuracy

5. CONCLUSIONS AND FUTURE DIRECTIONS
The proposed model for page filtering using segmentation and personalization offers the following advantages:
• When the content to be blocked is present in only a portion of the page, the proposed model blocks only the corresponding segments instead of the entire page, which is a distinct benefit to the user.
• Incorporation of personalization in the blocking process provides a tailor-made content filtering system based on the user’s needs.

The future directions for this research work are as listed below:
• In the proposed model, image filtering is performed using the “alt” text provided with the image. In future implementations, image analysis modules can be incorporated to make the image filtering much more efficient.
• Incorporation of the capability to handle languages other than English would make the system more effective for non-English web pages.

REFERENCES
[1] Paul Resnick and Jim Miller. PICS: Internet access controls without censorship. Communications of the ACM, 39(10):87-93, 1996.
[2] Net Nanny, Available: http://www.netnanny.com
[3] Cyber Patrol, Available: http://www.cyberpatrol.com/
[4] Cyber Sitter, Available: http://www.cybersitter.com
[5] Du, R.; Safavi-Naini, R.; Susilo, W., Web filtering using text classification, The 11th IEEE International Conference on Networks (ICON2003), pages 325-330, 2003.
[6] Weiming Hu, Ou Wu, Zhouyao Chen, Zhouyu Fu, Maybank, S., "Recognition of Pornographic Web Pages by Classifying Texts and Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 29, Issue 6, pages 1019-1034, June 2007.
[7] Deng Cai, Shipeng Yu, Ji-Rong Wen, and Wei-Ying Ma. Block-based web search. In SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 456–463, New York, NY, USA, 2004. ACM.
[8] Kaszkiel, M. and Zobel, J., Effective Ranking with Arbitrary Passages, Journal of the American Society for Information Science, Vol. 52, No. 4, 2001, pp. 344-364.
[9] D. Cai, S. Yu, J. Wen, and W.-Y. Ma, VIPS: A vision-based page segmentation algorithm, Tech. Rep. MSR-TR-2003-79, 2003.
[10] Cao, Jiuxin, Mao, Bo and Luo, Junzhou, 'A segmentation method for web page analysis using shrinking and dividing', International Journal of Parallel, Emergent and Distributed Systems, 25:2, 93-104, 2010.
[11] Kohlschütter, C. and Nejdl, W. A densitometric approach to web page segmentation. In Proceeding of the 17th ACM Conference on Information and Knowledge Management (Napa Valley, California, USA, October 26-30, 2008). CIKM '08. ACM, New York, NY, 1173-1182, 2008.
[12] Deepayan Chakrabarti, Ravi Kumar, Kunal Punera, A graph-theoretic approach to webpage segmentation, Proceeding of the 17th international conference on World Wide Web, April 21-25, Beijing, China, 2008.
[13] Jinbeom Kang, Jaeyoung Yang, Joongmin Choi, “Repetition-based Web Page Segmentation by Detecting Tag Patterns for Small-Screen Devices”, IEEE Transactions on Consumer Electronics, vol. 56, no. 2, pp. 980-986, 2010.
[14] Madaan Aastha, Chu Wanming, Bhalla Subhash, "VisHue: Web Page Segmentation for an Improved Query Interface for MedlinePlus Medical Encyclopedia", Databases in Networked Information Systems, Springer Lecture Notes in Computer Science, 2011.
[15] Xinyue Liu, Hongfei Lin, Ye Tian, Segmenting Webpage with Gomory-Hu Tree Based Clustering, Journal of Software, Vol 6, No 12 (2011), 2421-2425.

Authors
K.S.Kuppusamy is an Assistant Professor at the Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India. He obtained his Masters degree in Computer Science and Information Technology from Madurai Kamaraj University. He is currently pursuing his Ph.D in the field of Intelligent Information Management. His research interests include Web Search Engines and the Semantic Web. He has 8 international publications.

G. Aghila is a Professor at the Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India. She has a total of 22 years of teaching experience. She received her M.E (Computer Science and Engineering) and Ph.D. from Anna University, Chennai, India. She has published more than 55 research papers on web crawlers and ontology based information retrieval. She is currently a supervisor guiding 8 Ph.D. scholars. She is a recipient of the Schrneiger award. She is an expert in ontology development. Her areas of interest include Intelligent Information Management, artificial intelligence, text mining and semantic web technologies.