
Linkator: enriching web pages by automatically adding dereferenceable semantic annotations

In this paper, we introduce Linkator, an application architecture that
exploits semantic annotations for automatically adding links to previously
generated web pages. Linkator provides a mechanism for dereferencing these
semantic annotations with what it calls semantic links. Automatically adding
links to web pages improves users’ navigation. It connects the visited page
with external sources of information that the user may be interested in, but that
were not identified as such during the web page design phase. The process of
auto-linking comprises two stages: finding the terms to be linked and finding the
destination of each link. Linkator delegates the first stage to external semantic
annotation tools and concentrates on the process of finding a relevant
resource to link to. In this paper, a use case is presented that shows how this
mechanism can support knowledge workers in finding publications during their
navigation on the web.

Published in: Technology

  • I am in the starting phase of my PhD research. In this presentation, I will outline the vision at the start of the PhD period on the research problem, which is building trust in web content, and our approach to solving this problem. I will also give a brief plan of my PhD research.
  • We focus on content trust and formulate our main research questions. The first key issue is to investigate what kinds of factors can influence trust in content. Following from the first, we also need to know how to capture and represent the information about these factors. The third key issue is how to assess or compute content trust based on the information we get from the second step. Ideally, we want to have a trust value assigned to every piece of content. Unlike the propagation of trust through a network of people, we now have more information and semantics about the content, so we want to build metrics that assess trustworthiness based on the content itself and on the connections between different pieces of content, especially their semantic similarity and relations.
  • Linkator: enriching web pages by automatically adding dereferenceable semantic annotations

    1. Linkator: enriching web pages by automatically adding dereferenceable semantic annotations
       Samur Araujo, Geert-Jan Houben, Daniel Schwabe
       Web Information Systems
       Delft University of Technology, the Netherlands
    2. Summary – dereferencing semantic annotations
       What is dereferencing semantic annotations about?
       • Automatic linking of web pages.
       Summary
       • Overview of the problem and motivation.
       • Our approach to solving the problem.
       • One example of use.
    3. Motivation
       • Links between HTML pages are the main mechanism for navigating the web.
       • However, a lot of pages are unlinked or poorly linked.
       • Terms on pages have meaning and are intrinsically associated with concepts or entities that the user is interested in.
       • These terms can be interpreted by machines and automatically linked to relevant resources on the web.
    5. Problem Statement
       The problem of automatic linking can be divided into three sub-problems:
       • How to identify candidate terms (anchors) for adding links?
         A candidate term denotes a concept in which the user is interested.
       • Which concept does a candidate term represent?
         Disambiguate the candidate term.
       • How to identify a web resource to be the link target?
         How to select a source of data for finding the destination of the link?
    6. State-of-the-Art in Automatic Linking
       Candidate Terms:
       • Existing work focuses on term disambiguation using an auxiliary knowledge base or dictionaries (e.g. Wikipedia and WordNet).
       Link Target:
       • The target is selected from a specific knowledge base [1] or from a collection of target documents [2].
       Limitations:
       • These approaches do not serve users interested in a broader range of domains well.
       [1] Mihalcea, R. and Csomai, A. Wikify!: Linking documents to encyclopedic knowledge. In Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM 07), Lisbon, Portugal, pp. 233-242, 2007.
       [2] Gardner, J. J., Krowne, A., and Xiong, L. NNexus: An automatic linker for collaborative web-based corpora. IEEE Trans. Knowl. Data Eng. 21(6), pp. 829-839, 2009.
    7. Linkator Approach
       Linkator:
       • Extract Terms from Web Pages
       • Associate Terms to Concepts
       • Find Resources that Represent these Concepts
       Diagram labels: Core Linkator; Information Extraction Engine; Semantic Annotator
    8. Flow diagram labels: Link Clicked; Page Accessed; Page is accessed; Annotated page; Terms are extracted; Annotation is extracted; Page is semantically annotated; Endpoint is chosen; Semantic Links created; Query is formulated; If not found; Search for a resource
    9. Linkator Approach
       Architecture diagram labels: Web Browser; Linkator Client (Firefox Plugin); Annotator; RDFa Annotator; Information Extraction Engine; HTTP; Linkator Server; Linked Data; Endpoint Resolution; SPARQL; Query Formulation
    10. Semantic Link – Definition
        • A semantic link is an HTML A (anchor) tag that is semantically annotated with RDFa.
        • It has RDF triples associated with it.
        • Following a semantic link triggers a query over Linked Data.
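To make the definition on slide 10 concrete, here is a minimal Python sketch that renders an anchor tag carrying RDFa attributes. The helper name `semantic_link`, the subject identifier `#author-1`, and the choice of `foaf:name` as the property are illustrative assumptions, not taken from Linkator; only the SWRC vocabulary URI appears in the slides.

```python
from html import escape

# Prefix declarations for the annotation. The SWRC namespace comes from the
# slides; using FOAF for the person's name is an assumption for illustration.
PREFIXES = {
    "swrc": "http://ontoware.org/swrc/swrc_v0.3.owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def semantic_link(label, subject, rdf_type, name_property):
    """Render an <a> element annotated with RDFa attributes (hypothetical helper).

    The element is meant to carry two triples:
      <subject> rdf:type <rdf_type>
      <subject> <name_property> "label"
    """
    namespaces = " ".join(f'xmlns:{p}="{uri}"' for p, uri in PREFIXES.items())
    return (
        f'<a {namespaces} about="{escape(subject)}" '
        f'typeof="{escape(rdf_type)}" property="{escape(name_property)}">'
        f"{escape(label)}</a>"
    )


print(semantic_link("Geert-Jan Houben", "#author-1", "swrc:Author", "foaf:name"))
```

The anchor deliberately has no fixed `href`: in the approach described here, the destination is resolved when the link is followed, from the triples attached to it.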
    11. RDF Triples associated to the Semantic Link
        Diagram label: Semantic Links
    12. Dereferencing Semantic Links
        Linkator uses the Linked Data cloud to discover a destination for the semantic link, as opposed to querying search engines or a fixed knowledge base.
        • Algorithm for Endpoint Resolution
        • Algorithm for Query Formulation
    13. Endpoint Resolution
        • Task: Find endpoints that contain a specific concept.
        Linkator selects available endpoints based on the vocabularies used in the semantic links, using voiD (Vocabulary of Interlinked Datasets) descriptions.
    14. Endpoint Resolution
        • Select the vocabulary of all RDF types associated with the annotation.
        • Or select the vocabularies of all predicates associated with the annotation.
    15. Endpoint Resolution
        • The SelectEndpoint function finds the resource: http://ontoware.org/swrc/swrc_v0.3.owl#Author
        • It extracts the vocabulary associated with this resource: http://ontoware.org/swrc/swrc_v0.3.owl
        • It queries the voiD descriptors of the available SPARQL endpoints, looking for that vocabulary.
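The two steps on slide 15 can be sketched in Python as follows. This is a sketch under stated assumptions, not Linkator's code: the function names are hypothetical, the rule for slash-terminated namespaces is a heuristic added here, and the query assumes the voiD descriptions use the `void:vocabulary` and `void:sparqlEndpoint` properties.

```python
from urllib.parse import urldefrag


def extract_vocabulary(resource_uri):
    """Return the vocabulary URI that a resource URI belongs to.

    For hash URIs the fragment is dropped, as in the slide's example.
    For slash URIs the last path segment is dropped (a heuristic assumed here).
    """
    base, fragment = urldefrag(resource_uri)
    if fragment:
        return base
    return resource_uri.rsplit("/", 1)[0] + "/"


def void_lookup_query(vocabulary_uri):
    """Build a SPARQL query over voiD descriptions for endpoints using a vocabulary."""
    return (
        "PREFIX void: <http://rdfs.org/ns/void#>\n"
        "SELECT DISTINCT ?endpoint WHERE {\n"
        f"  ?dataset void:vocabulary <{vocabulary_uri}> ;\n"
        "           void:sparqlEndpoint ?endpoint .\n"
        "}"
    )


vocabulary = extract_vocabulary("http://ontoware.org/swrc/swrc_v0.3.owl#Author")
print(vocabulary)
print(void_lookup_query(vocabulary))
```

Where the voiD descriptions themselves are stored and queried is not stated in the slides, so the sketch stops at building the query string.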
    16. Query Formulation
        • The query is based on the object of the triple.
        • It tries to find a human-readable representation of the resource, i.e., it tries to match predicates such as foaf:homepage, akt:has-web-address, and rdfs:seeAlso.
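One possible reading of slide 16, sketched in Python: take the object of a triple (here a literal such as an author name), look for a resource in the chosen endpoint that has this value, and ask for a human-readable page through one of the listed predicates. The function names and the query shape are assumptions made for illustration, and the full IRI used for the `akt:` prefix is assumed rather than taken from the slides.

```python
# Predicates named on the slide, written as full IRIs. The FOAF and RDFS
# namespaces are the standard ones; the AKT namespace is an assumption.
HUMAN_READABLE_PREDICATES = [
    "http://xmlns.com/foaf/0.1/homepage",
    "http://www.aktors.org/ontology/portal#has-web-address",
    "http://www.w3.org/2000/01/rdf-schema#seeAlso",
]


def sparql_literal(text):
    """Quote a Python string as a SPARQL string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def formulate_query(object_value, predicates=HUMAN_READABLE_PREDICATES):
    """Build a query for resources that have the given object value and a web page."""
    alternatives = " || ".join(f"?link = <{p}>" for p in predicates)
    return (
        "SELECT DISTINCT ?resource ?page WHERE {\n"
        f"  ?resource ?property {sparql_literal(object_value)} .\n"
        "  ?resource ?link ?page .\n"
        f"  FILTER ({alternatives})\n"
        "}"
    )


print(formulate_query("Geert-Jan Houben"))
```

The `FILTER` with `||` is used instead of newer constructs so the query stays within SPARQL 1.0, which is what endpoints of that period would have supported.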
    17. Proof of Concept
        • Semantic links for pages that contain bibliographic citations.
        • Extended version of the FreeCite parsing engine.
        Example of a bibliographic citation:
        Kees van der Sluijs, Geert-Jan Houben, Erwin Leonardi, Jan Hidders. Hera: Engineering Web Applications Using Semantic Web-Based Models. Book chapter: Semantic Web Information Management: A Model-Based Perspective, De Virgilio, Roberto; Giunchiglia, Fausto; Tanca, Letizia (Eds.), Chapter 22, 2010, Springer.
    18. Linkator
        • Extract Terms from Web Pages
        • Associate Terms to Concepts
        • Find Resources that Represent these Concepts
        Diagram labels: Core Linkator; Information Extraction Engine; Semantic Annotator; HTML Page; SPARQL Endpoint Discovery and Selection; Markup Removed; Entity Extraction; Plain Text; Text Semantically Annotated; Endpoint Querying; Semantic link clicked; Semantic Annotation; Insert annotations on the page; HTML Page Semantically Annotated; URL Generation; FreeCite Extraction Engine
    19. Example – HTML Page without Links
    21. Example – Page annotated with RDFa
    22. Example – Page with Semantic Links
    25. Conclusion and Future Work
        • For the specific scenario of linking bibliographic citations, Linkator provides a reasonable solution.
        • Composing Semantic Web technologies can provide a reasonable solution to the problem of automatic linking.
        • Linkator is a concrete application that uses Semantic Web technologies.
        Future Work:
        • Use Linkator in a broader scenario.
        • Enhance the Linkator algorithms.
        • Evaluate the precision and recall of the linking.
    26. Questions?
        Thank you for your attention!
        Samur Araujo
        s.f.cardosodearaujo@tudelft.nl
        You can download Linkator at:
        http://www.wis.ewi.tudelft.nl/
    27. Annotations on the page are used to find the link destination
        Diagram labels: Annotated HTML Page; HTML Page; Page is annotated; Link is clicked; RDF
    28. State-of-the-Art in Automatic Linking
        Examples:
        • Wikify! [1] focuses on linking keywords on web pages to Wikipedia articles.
        • NNexus [2] focuses on linking keywords obtained from an index extracted from target documents.
        [1] Mihalcea, R. and Csomai, A. Wikify!: Linking documents to encyclopedic knowledge. In Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM 07), Lisbon, Portugal, pp. 233-242, 2007.
        [2] Gardner, J. J., Krowne, A., and Xiong, L. NNexus: An automatic linker for collaborative web-based corpora. IEEE Trans. Knowl. Data Eng. 21(6), pp. 829-839, 2009.
    29. Endpoint Resolution
        FUNCTION SelectEndpoint
            E := Array
            R := select all rdf:type objects associated to the semantic link
            T := ExtractVocabulary(R)
            FOR EACH vocabulary in T DO
            {
                E.add(select endpoints that contain this vocabulary)
            }
            IF E = Empty
            {
                R := select all predicates associated to the semantic link
                T := ExtractVocabulary(R)
                FOR EACH vocabulary in T DO
                {
                    E.add(select endpoints that contain this vocabulary)
                }
            }
            RETURN E

        FUNCTION ExtractVocabulary(R)
            V := Array
            FOR EACH resource in R DO
            {
                V.add(extract the vocabulary from the resource)
            }
            RETURN V
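The SelectEndpoint pseudocode on slide 29 can be turned into a small runnable Python sketch. Several things here are assumptions rather than Linkator's actual implementation: triples are plain `(subject, predicate, object)` tuples, the voiD lookup is replaced by an in-memory `registry` dictionary mapping endpoint URLs to the vocabularies they use, and the endpoint URLs in the usage example are made up.

```python
from urllib.parse import urldefrag

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def vocabulary_of(resource_uri):
    """Drop the fragment (hash URIs) or the last path segment (slash URIs)."""
    base, fragment = urldefrag(resource_uri)
    return base if fragment else resource_uri.rsplit("/", 1)[0] + "/"


def extract_vocabulary(resources):
    """ExtractVocabulary(R): one vocabulary URI per resource."""
    return [vocabulary_of(resource) for resource in resources]


def select_endpoint(triples, registry):
    """SelectEndpoint: try rdf:type objects first, then fall back to predicates.

    `registry` stands in for the voiD descriptors: endpoint URL -> set of vocabularies.
    """
    def endpoints_for(resources):
        found = []
        for vocabulary in extract_vocabulary(resources):
            for endpoint, vocabularies in registry.items():
                if vocabulary in vocabularies and endpoint not in found:
                    found.append(endpoint)
        return found

    endpoints = endpoints_for([o for _, p, o in triples if p == RDF_TYPE])
    if not endpoints:
        # No endpoint knows the types, so use the vocabularies of all predicates.
        endpoints = endpoints_for([p for _, p, _ in triples])
    return endpoints


triples = [
    ("#author-1", RDF_TYPE, "http://ontoware.org/swrc/swrc_v0.3.owl#Author"),
    ("#author-1", "http://xmlns.com/foaf/0.1/name", "Geert-Jan Houben"),
]
registry = {"http://example.org/sparql": {"http://ontoware.org/swrc/swrc_v0.3.owl"}}
print(select_endpoint(triples, registry))
```

Unlike the pseudocode, the sketch skips endpoints that were already selected, so an endpoint matching several vocabularies is returned once.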
    30. Semantic Link – Example
        Triples associated with the semantic link.
