The document presents a method for automatically detecting the roles of visual elements on web pages. It uses an ontology to represent roles and their properties, and a rule-based system to assign roles based on visual element properties. An evaluation showed the system could accurately detect roles for over 80% of elements on test pages, with performance varying based on page complexity. The authors conclude the ontology and heuristic-based approach is adaptable and the knowledge base can be modified for different purposes. Future work is planned to improve the knowledge base and implement the system as a web service.
Heuristic Role Detection of Visual Elements of Web Pages
1. Introduction
Ontology Based Heuristic Role Detection
Evaluation
Conclusion
Heuristic Role Detection of
Visual Elements of Web Pages
M. Elgin Akpınar¹
Yeliz Yeşilada²
¹ elgin.akpinar@metu.edu.tr, Middle East Technical University, Ankara, Turkey
² yyeliz@metu.edu.tr, Middle East Technical University Northern Cyprus Campus, Kalkanlı, Güzelyurt, Mersin 10, Turkey
ICWE, 2013
M. Elgin Akpınar, Yeliz Yeşilada: Heuristic Role Detection of Visual Elements of Web Pages
Outline
1 Introduction
2 Ontology Based Heuristic Role Detection
3 Evaluation
4 Conclusion
Motivation
Related Work
Problem Definition
Accessibility issues in interactive web pages
Problems accessing content in alternative forms, such as audio, with assistive technologies
Problems with mobile devices
Screen size problems
Limited resources
Problem Definition (cont.)
Compatibility issues
Development of new web technologies
Dynamic web content, HTML5, etc.
Flexible syntax of HTML and CSS
Ability to create the same visual layout with different underlying coding
Inability to fully describe web elements
Requirements
Propose a method to automatically identify visual elements in web pages
Serving different purposes
Providing better accessibility for disabled people and mobile devices
Improving the accuracy of information retrieval and data mining applications
Transcoding or reorganising web page structure for better presentation
Adapting to new technologies
Recent Application Fields
Web page adaptation for small screen devices [Yin & Lee, 2005, Ahmadi & Kong, 2008, Chen et al., 2005, Xiao et al., 2008, Chen et al., 2001]
Intelligent user interface creation [Xiang & Shi, 2006]
Information retrieval and web data mining [Kovacevic et al., 2002, Lin & Ho, 2002, Liu et al., 2003, Yi et al., 2003]
Web accessibility [Takagi et al., 2002]
Drawbacks
Simplistic sets of roles
Narrow understanding of web page elements
Inability to describe a web page semantically
Static definition of roles and attributes
Maintenance problems
Visual Element Identifier
Rule Generator
Role Detector
System Architecture
Vision Based Page Segmentation Algorithm (VIPS)
Aims to extract the block structure of a page by using visual cues and tag properties of the nodes.
Visual Cues: Tag, color, text and size of a node
[Cai et al., 2003]
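The idea of cue-driven block extraction can be illustrated with a minimal Python sketch. This is not the VIPS algorithm of [Cai et al., 2003]: the `Node` class, the `BLOCK_TAGS` set, the `min_area` threshold and the two division cues (a child with a block-level tag, or children with differing background colors) are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical set of tags treated as block-level cues (an assumption).
BLOCK_TAGS = {"div", "table", "ul", "header", "footer", "section", "p"}


@dataclass
class Node:
    """A rendered DOM node carrying the cues named on the slide."""
    tag: str
    color: str = "white"      # background color
    text: str = ""
    width: int = 0
    height: int = 0
    children: List["Node"] = field(default_factory=list)


def should_divide(node: Node, min_area: int) -> bool:
    """Decide whether a node is split into its children or kept as one block."""
    if not node.children or node.width * node.height < min_area:
        return False  # leaves and small nodes stay whole
    has_block_child = any(c.tag in BLOCK_TAGS for c in node.children)
    colors_differ = len({c.color for c in node.children}) > 1
    return has_block_child or colors_differ


def extract_blocks(node: Node, min_area: int = 10_000) -> List[Node]:
    """Return the visual blocks found under `node`, in document order."""
    if not should_divide(node, min_area):
        return [node]
    blocks: List[Node] = []
    for child in node.children:
        blocks.extend(extract_blocks(child, min_area))
    return blocks


page = Node("body", width=1000, height=2000, children=[
    Node("header", color="navy", text="Site title", width=1000, height=100),
    Node("div", width=1000, height=1800, children=[
        Node("span", text="Hello", width=200, height=20),
        Node("span", text="World", width=200, height=20),
    ]),
])
blocks = extract_blocks(page)
print([b.tag for b in blocks])  # ['header', 'div']
```

In this toy page the body is divided because it has block-level children, while the inner div stays a single block because its children share one background color and none of them is block-level.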
Knowledge Representation
Systematic characterisation of roles of visual elements
Definition of properties which affect how visual elements are used and presented
Visual styles, specific keywords, relation between parent and children elements
eMine Ontology
Based on WAfA Ontology [Harper & Yesilada, 2007]
Iterative knowledge base construction:
Comparison with ARIA Ontology [Craig & Cooper, 2010]
Factor annotations
Object property classification
An object property for Header role
...
<owl:Restriction>
  <owl:onProperty rdf:resource="emine#has_tag" />
  <owl:allValuesFrom>
    <owl:Class>
      <owl:oneOf rdf:parseType="Collection">
        <owl:Thing rdf:about="emine#Header" />
        <owl:Thing rdf:about="emine#Div" />
      </owl:oneOf>
    </owl:Class>
  </owl:allValuesFrom>
</owl:Restriction>
...
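The Rule Generator turns ontology restrictions such as the one above into Jess rules. The deck does not show how, so the following Python sketch is only one plausible reading: it parses a `has_tag` restriction and emits one rule per allowed value. The `generate_rules` function, the `weight` and `start` parameters, the lower-casing of tag names and the enclosing `rdf:RDF` element are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The restriction from the slide, wrapped in a root element with namespace
# declarations so that it parses as standalone XML (the wrapper is assumed).
RESTRICTION = f"""
<rdf:RDF xmlns:owl="{OWL}" xmlns:rdf="{RDF}">
  <owl:Restriction>
    <owl:onProperty rdf:resource="emine#has_tag" />
    <owl:allValuesFrom>
      <owl:Class>
        <owl:oneOf rdf:parseType="Collection">
          <owl:Thing rdf:about="emine#Header" />
          <owl:Thing rdf:about="emine#Div" />
        </owl:oneOf>
      </owl:Class>
    </owl:allValuesFrom>
  </owl:Restriction>
</rdf:RDF>
"""


def local_name(iri: str) -> str:
    """Return the fragment after '#', e.g. 'emine#Div' -> 'Div'."""
    return iri.rsplit("#", 1)[-1]


def generate_rules(xml_text: str, role: str, weight: int = 2, start: int = 1) -> list:
    """Emit one Jess-style rule per allowed value of each restriction.

    `weight` and `start` are illustrative assumptions, not values taken
    from the eMine knowledge base.
    """
    root = ET.fromstring(xml_text)
    rules = []
    index = start
    for restriction in root.iter(f"{{{OWL}}}Restriction"):
        prop = restriction.find(f"{{{OWL}}}onProperty")
        slot = local_name(prop.get(f"{{{RDF}}}resource"))
        for thing in restriction.iter(f"{{{OWL}}}Thing"):
            value = local_name(thing.get(f"{{{RDF}}}about")).lower()
            rules.append(
                f"(defrule {role}{index:02d} (block ({slot} $? /.*{value}.*/ $?))\n"
                f"  =>\n"
                f"  (bind ?*{role}* (+ {weight} ?*{role}*)))"
            )
            index += 1
    return rules


rules = generate_rules(RESTRICTION, role="Header", start=6)
print("\n\n".join(rules))
```

With `start=6` the two generated rules are named Header06 and Header07 and test the `has_tag` slot against `header` and `div`.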
Role Detector
Jess, a Java based rule engine and scripting environment
Initial state: a set of rules converted from the eMine Ontology, and a tree of unlabeled visual elements
Process of role detection:
1 Rule engine object construction
2 Loading of template definitions and initial variables
3 Assertion of facts (properties of visual elements)
4 Firing of predefined rules over visual elements
Final state: a tree of labeled visual elements
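The four steps above can be sketched in plain Python as a stand-in for the Jess engine. None of this is the authors' implementation: the `Block` and `RoleEngine` classes, the Footer rules, and the choice to label a block with its highest-scoring role (leaving it unlabeled when no rule fires) are assumptions. Only the two Header patterns and their weight of 2 mirror rules shown in the deck.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# (role, slot, pattern, weight). The Header rules mirror the deck's Jess
# rules; the Footer rules are invented examples.
Rule = Tuple[str, str, str, int]
RULES: List[Rule] = [
    ("Header", "has_tag", r".*header.*", 2),
    ("Header", "has_tag", r".*div.*", 2),
    ("Footer", "has_tag", r".*footer.*", 2),
    ("Footer", "has_tag", r".*div.*", 2),
]


@dataclass
class Block:
    """A visual element: its facts (slot -> values) and child blocks."""
    facts: Dict[str, List[str]]
    children: List["Block"] = field(default_factory=list)
    role: Optional[str] = None  # unlabeled until detection runs


class RoleEngine:
    def __init__(self, rules: List[Rule]):
        # Step 1: construct the engine with its rule set.
        self.rules = rules

    def detect(self, block: Block) -> Optional[str]:
        # Step 2: load initial variables (one score per role, starting at 0).
        scores = {role: 0 for role, _, _, _ in self.rules}
        # Step 3: assert the block's properties as facts.
        facts = block.facts
        # Step 4: fire every rule whose pattern matches an asserted value.
        for role, slot, pattern, weight in self.rules:
            if any(re.fullmatch(pattern, v) for v in facts.get(slot, [])):
                scores[role] += weight
        # Assumed decision step: the highest score wins; no match, no label.
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else None

    def label_tree(self, root: Block) -> Block:
        """Label every block in the tree in place and return the root."""
        root.role = self.detect(root)
        for child in root.children:
            self.label_tree(child)
        return root


tree = Block({"has_tag": ["body"]}, children=[
    Block({"has_tag": ["div", "header"]}),
    Block({"has_tag": ["footer"]}),
])
labeled = RoleEngine(RULES).label_tree(tree)
print(labeled.role, [c.role for c in labeled.children])  # None ['Header', 'Footer']
```

The first child scores 4 for Header (both Header rules fire) against 2 for Footer, which shows how accumulated weights separate roles that share a cue such as the `div` tag.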
Jess rules for Header role
...
(defrule Header06 (block (has_tag $? /.*header.*/ $?))
  =>
  (bind ?*Header* (+ 2 ?*Header*)))
...
(defrule Header07 (block (has_tag $? /.*div.*/ $?))
  =>
  (bind ?*Header* (+ 2 ?*Header*)))
...
Labeled Block Structure
Evaluation
Results
Evaluation
User Evaluation
Online survey based evaluation
Given a list of roles, participants were asked to assign a role to each of the given visual blocks
Nine randomly chosen web pages from a group of 30 pages
25 participants took part in the evaluation
Technical Evaluation
Technical feasibility of the proposed approach and its implementation
User Evaluation Results
Complexity Group   System-Expert Evaluation   Receptive Evaluation   Block Count
Low                79.82 %                    73.68 %                65
Medium             88.28 %                    79.77 %                237
High               88.47 %                    85.53 %                569
Overall            86.83 %                    80.82 %                298
Technical Evaluation Results
Complexity Group   Total Memory   Total Time   Avg. Memory per Block   Avg. Time per Block   Block Count
Low                8,369 KB       6,576 ms     244.29 KB               102.29 ms             65
Medium             7,013 KB       23,799 ms    36.44 KB                102.12 ms             237
High               9,165 KB       54,837 ms    34.28 KB                101.95 ms             569
Overall            8,176 KB       29,157 ms    100.20 KB               102.11 ms             298
Conclusion
Ontology based heuristic approach
Probabilistic model
Automatic identification and classification of web elements
Visual element identifier
Knowledge base
Heuristic role detector
Adaptable to different domains, purposes and requirements
Modifiable knowledge base
Future Work
Improvements to our system
Knowledge base improvement
Web service implementation
Reengineering web pages
Thank you for listening!
For further information
Contact: elgin.akpinar@metu.edu.tr
Project Page: http://emine.ncc.metu.edu.tr/
References
Ahmadi, H. & Kong, J. (2008).
Efficient web browsing on small screens.
In Proceedings of the working conference on Advanced visual interfaces (pp. 23-30). ACM.
Cai, D., Yu, S., Wen, J. R., & Ma, W. Y. (2003).
VIPS: a vision based page segmentation algorithm.
Technical Report MSR-TR-2003-79, Microsoft Research.
Chen, J., Zhou, B., Shi, J., Zhang, H., & Fengwu, Q. (2001).
Function-based object model towards website adaptation.
In WWW '01 (pp. 587-596). ACM.
Chen, Y., Xie, X., Ma, W.-Y., & Zhang, H.-J. (2005).
Adapting web pages for small-screen devices.
IEEE Internet Computing, 9(1), 50-56.
Craig, J. & Cooper, M. (2010).
Accessible rich internet applications (WAI-ARIA) 1.0.
http://www.w3.org/TR/2010/WD-wai-aria-20100916/complete
Retrieved on 15.01.2013.
Harper, S. & Yesilada, Y. (2007).
Web authoring for accessibility (WAfA).
Journal of Web Semantics (JWS), 5(3), 175-179.
Kovacevic, M., Diligenti, M., Gori, M., & Milutinovic, V. (2002).
Recognition of common areas in a web page using visual information: a possible application in a page classification.
In Proceedings 2002 IEEE International Conference on Data Mining (pp. 250-257). Washington, DC, USA: IEEE Computer Society.
Lin, S.-H. & Ho, J.-M. (2002).
Discovering informative content blocks from web documents.
In KDD '02 (pp. 588-593). ACM.
Liu, B., Chin, C. W., & Ng, H. T. (2003).
Mining topic-specific concepts and definitions on the web.
In WWW '03 (pp. 251-260). ACM.
Takagi, H., Asakawa, C., Fukuda, K., & Maeda, J. (2002).
Site-wide annotation: reconstructing existing pages to be accessible.
In ASSETS '02 (pp. 81-88). ACM.
Xiang, P. & Shi, Y. (2006).
Recovering semantic relations from web pages based on visual cues.
In IUI '06 (pp. 342-344). ACM.
Xiao, Y., Tao, Y., & Li, W. (2008).
A dynamic web page adaptation for mobile device based on web2.0.
In Proceedings of the 2008 Advanced Software Engineering and Its Applications (pp. 119-122). USA: IEEE Computer Society.
Yi, L., Liu, B., & Li, X. (2003).
Eliminating noisy information in web pages for data mining.
In KDD '03 (pp. 296-305). ACM.
Yin, X. & Lee, W. S. (2005).
Understanding the function of web elements for mobile content delivery using random walk models.
In WWW '05 (pp. 1150-1151). ACM.