XSLT+SPARQL: Scripting the Semantic Web with SPARQL embedded into XSLT stylesheets
   Diego Berrueta, Jose E. Labra and ...

Slides presented in the SFSW2008 workshop at Tenerife (co-located with ESWC2008). 02 Jun 2008.

  • Good morning. My name is Diego Berrueta and I’ll present an idea to extend XSLT with some functions that allow developers to query RDF graphs using SPARQL.
  • These are the main points that will be covered in this presentation. I’ll start by introducing the motivation for this work.
  • We have observed that a number of semantic web applications need a means to transform RDF data into other formats. The semantic web is all about finding RDF data on the web, combining it, querying it and reasoning with it, but in the end the data must be presented to the user. In the XML world, there is a wonderful technology called XSLT (or “XML Transformations”), which is a W3C standard designed to take some data in XML and transform it. The result is usually another XML file, but it can also be plain text. But when it comes to RDF, what is the equivalent of XSLT?
  • Hey, wait a moment! Some of you may think that we don’t need a technology equivalent to XSLT in the RDF world because we can write RDF in XML. That’s the purpose of the RDF/XML syntax, isn’t it? If we want to transform RDF data, we can serialize it as RDF/XML and apply XSLT to it. Well, there are some problems with this idea. In the first place, the RDF/XML syntax is very complex: there are many different ways to serialize even the simplest RDF graph. As XSLT is syntax-driven, this means that it is incredibly difficult to write an XSLT stylesheet to transform data in RDF/XML. In the second place, XSLT internally uses XPath to select fragments of the input data. But XPath was designed to work on tree structures, not on graphs. What we have here is an impedance mismatch between a tool that was designed to work on data structured as trees and the RDF data model, which is a more general data structure: a graph. Therefore, let me emphasize this point:
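To make the syntax problem concrete, here is a minimal sketch in Python (standard library only). The two documents are illustrative assumptions, not taken from the talk: both are valid RDF/XML for the same two triples, yet a path expression written against the first shape finds nothing in the second.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Serialization A: a typed node element with a property element.
DOC_A = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.org/tom">
    <foaf:name>Tom</foaf:name>
  </foaf:Person>
</rdf:RDF>
"""

# Serialization B: the same two triples, written with rdf:Description,
# an explicit rdf:type and a property attribute.
DOC_B = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://example.org/tom" foaf:name="Tom">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
  </rdf:Description>
</rdf:RDF>
"""

def names_via_path(doc):
    """Select person names with a path tailored to serialization A."""
    root = ET.fromstring(doc.strip())
    return [el.text for el in root.findall("foaf:Person/foaf:name", NS)]

names_a = names_via_path(DOC_A)  # the path matches this shape
names_b = names_via_path(DOC_B)  # same graph, but the path matches nothing
print(names_a, names_b)
```

A stylesheet would need one template per possible shape to be robust, which is the difficulty described above.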
  • ... even if there is a syntax to serialize RDF as XML, this doesn’t mean that RDF *is* XML. It is not. Therefore, although at first glance it may seem feasible, in practice we cannot use XML tools for RDF data. In particular, XSLT is not a good option to transform RDF graphs.
  • Now, let’s take a look at how others are tackling this problem. There are some proposals to create other serialization syntaxes for RDF in XML which are simpler than RDF/XML. Unfortunately, they are not standard, and they are very verbose. Another approach is to add some intelligence to the XSLT processors, so that when they evaluate an XPath expression against an RDF graph, they flatten the graph to a dynamic tree. The third proposal here is the closest to our work, but it has been made obsolete by SPARQL. Finally, Axel Polleres and others have recently proposed a clever method to unify SPARQL and XQuery in a single language, and they will present their work tomorrow in the main track, so if you’re interested, I recommend that you attend their presentation. By the way, their paper is one of the candidates for the best paper award.
  • Finally, there is another way to transform the RDF data. We can write the transformation logic using our favorite scripting language: PHP, Python, whatever. But this way is not without problems. On the one hand, there is no standardized API to access RDF data yet. Compare this with the situation in the XML world, where they have DOM and SAX. On the other hand, coding transformation logic in conventional scripting languages usually leads to messy programs. This is one of the reasons that make XSLT so popular in the XML world. Most people don’t want to write transformation logic in Java or Python.
  • Therefore, let me introduce you to XSLT+SPARQL. The idea is quite simple: we defined two small sets of XPath extension functions that make it possible to query RDF models using SPARQL. These functions are intended to be used in the “select” attributes of XSLT stylesheets. In this way, instead of selecting fragments of the XML input data, the developer can select fragments of an RDF graph. These functions return very simple XML documents that contain the result of the SPARQL query. For this purpose, we use...
  • ... a very simple XML syntax defined by the W3C. The result of a SELECT query in SPARQL is a table with bindings for the query variables. In this syntax, each row of the table produces a “result” element, and each column produces a variable binding. In summary, we took three W3C technologies, namely XSLT, SPARQL, and the XML syntax for SPARQL results, and we defined two sets of functions that bridge RDF and XML.
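The row-and-column reading of this syntax can be sketched in a few lines of Python. The sample document copies the simplified element names shown on the slide; the actual W3C format wraps the results in a namespaced `sparql` root element and types each value (for example as `literal` or `uri`), so real output would need namespace-aware paths. The function name `rows_from_results` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Simplified result document, using the element names from the slide.
RESULTS = """
<results>
  <result>
    <binding name="fname"><value>Tom</value></binding>
    <binding name="email"><value>tom@example.org</value></binding>
  </result>
  <result>
    <binding name="fname"><value>Dick</value></binding>
    <binding name="email"><value>dick@example.org</value></binding>
  </result>
</results>
"""

def rows_from_results(xml_text):
    """Turn each <result> into a dict: one key per variable binding."""
    root = ET.fromstring(xml_text.strip())
    return [
        {b.get("name"): b.findtext("value") for b in result.findall("binding")}
        for result in root.findall("result")
    ]

rows = rows_from_results(RESULTS)
for row in rows:
    print(row["fname"], row["email"])
```

Because the result is an ordinary XML tree, an XSLT template can walk it exactly as this function does.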
  • As this is a workshop of developers, I’ll spend some time describing these functions, that is, I’ll explain the XSLT+SPARQL API. The first set of functions, which we call “basic functions”, contains just these two functions. They allow the developer to execute a query and they return the results. The difference between them is that the first one retrieves the RDF data from anywhere on the web and executes the query locally, while the second one uses a SPARQL endpoint to execute the query remotely. The first one can be used, for instance, to query the contents of a FOAF file, while the second one can be used against the DBPedia endpoint.
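For the remote case, the SPARQL Protocol lets a client send the query text in a `query` parameter of an HTTP GET request. The sketch below only builds such a request URL and performs no network access. The endpoint URL, the query and the helper name `endpoint_request_url` are assumptions for illustration; the talk does not say how `sparql:sparqlEndpoint` issues its requests.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def endpoint_request_url(endpoint_url, query):
    """Build a SPARQL Protocol GET URL: the query goes in the 'query' parameter."""
    return endpoint_url + "?" + urlencode({"query": query})

# Hypothetical endpoint and query, for illustration only.
QUERY = "SELECT ?city WHERE { ?city a <http://example.org/City> } LIMIT 5"
url = endpoint_request_url("http://example.org/sparql", QUERY)
print(url)

# Decoding the URL gives back the original query text unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

The local variant would instead fetch the document, parse it, and evaluate the query in-process.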
  • The second set of functions are the advanced ones. They make it possible to write more efficient programs by avoiding repeated parsing of the same RDF graph, and to build custom RDF models by merging information from different sources.
  • The first function of the advanced set can parse a string that contains serialized RDF data. This function does not return the results of any query, but a handler to an in-memory model.
  • There are two other functions to read RDF data and create handlers. The one that receives a URL fetches the document and parses it as RDF. The last one parses an XSLT nodeset as if it were RDF/XML. This nodeset can be a fragment of the XSLT input document, but it can even be a part of the XSLT stylesheet.
  • This fourth function has the ability to merge two or more in-memory RDF models, identified by their handlers, into a new one. It allows the developer to build custom RDF models by picking and joining different pieces, which are parsed with the three previous functions.
  • Finally, the last function executes a SPARQL query against an in-memory RDF model. Note that this function does not parse any file, nor does it retrieve data from the web. Therefore, it is much quicker than the two query functions we described in the basic set of functions.
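The benefit of querying an already parsed model can be sketched as follows. Everything here is an illustrative assumption: the whitespace-separated triple format, the `parse` and `match` helpers, and the single triple pattern standing in for a full SPARQL query. The point is only that parsing happens once while queries run many times.

```python
parse_calls = 0

def parse(text):
    """Toy parser: one 'subject predicate object' triple per line."""
    global parse_calls
    parse_calls += 1
    return {tuple(line.split(" ", 2)) for line in text.strip().splitlines()}

def match(model, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in model
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

DATA = """
ex:ivan foaf:knows ex:tom
ex:ivan foaf:knows ex:dick
ex:tom foaf:name Tom
"""

model = parse(DATA)  # the expensive step, done once

# Two separate queries reuse the same in-memory model.
friends = sorted(o for _, _, o in match(model, s="ex:ivan", p="foaf:knows"))
names = match(model, p="foaf:name")
print(friends, names, parse_calls)
```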
  • How can you use these functions? We have two implementations. The first one is written in Java, and it uses the extensibility mechanism supported by the XSLT language, namely, the ability to define new user functions in a new namespace. Our code uses the Jena library to load and query RDF, and it is specific to Apache Xalan (which is an XSLT processor), although it should be easily portable to other XSLT processors. In parallel, we have a partial implementation of the basic functions written in pure XSLT. The main limitation of this portable implementation is that the document() function of XSLT lacks the ability to perform content negotiation.
  • The main application of XSLT+SPARQL is the transformation of RDF data to other formats, mainly to XML, and in particular to presentational formats such as XHTML and SVG. In this role, XSLT+SPARQL works as the last step of a Semantic Web application. But there are other possibilities. For instance, the current version of SPARQL is somewhat limited with respect to its ability to generate reports, especially if you compare it with SQL. With XSLT+SPARQL, however, it is possible to compute aggregates and to group the results of a query. Finally, it is also possible to use XSLT+SPARQL as a language to implement some simple scripts for the semantic web.
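The kind of report meant here is post-processing of the binding table: grouping rows and counting them, as SQL does with GROUP BY and COUNT. A minimal Python sketch of that step follows; the rows are made-up sample data, and in XSLT+SPARQL the same grouping would be written with XSLT constructs over the result document.

```python
from collections import Counter, defaultdict

# Made-up result rows, as they might come back from a SELECT query.
rows = [
    {"person": "Tom", "project": "alpha"},
    {"person": "Tom", "project": "beta"},
    {"person": "Dick", "project": "alpha"},
]

# Aggregation: number of projects per person (COUNT ... GROUP BY person).
projects_per_person = Counter(row["person"] for row in rows)

# Grouping: list of people per project.
people_by_project = defaultdict(list)
for row in rows:
    people_by_project[row["project"]].append(row["person"])

print(dict(projects_per_person))
print(dict(people_by_project))
```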
  • In this example, we embed a SPARQL query within an XSLT stylesheet. This query simply fetches the FOAF file of Ivan Herman and returns the list of his friends, possibly with the SHA1 hash of their mailbox and the URI of their webpage. The function call sits inside the select attribute of an apply-templates element, so the XSLT processor will search for a template that matches the root element of the results. We can write one and continue the processing of the query results, for instance, to render an HTML table with the information.
  • This is a simple example, but we have some others which are more complex and can provide a better insight into the features of XSLT+SPARQL. For instance, we created one that does something similar to the example of Ivan’s friends, but it uses the DBPedia SPARQL endpoint to get any kind of data, for instance, a list of German cities. Other examples can produce two kinds of ISO-standardized displays for thesauri from SKOS data. One of them is simply an alphabetical listing, but the second one is a hierarchical display that looks like a tree. To build it, we simply used recursive XSLT templates. Finally, we wrote a “spider” agent that has the ability to retrieve data from the web on demand. For instance, if Ivan’s FOAF file does not contain the names of his friends, the script can dereference the URIs and progressively build a richer RDF model.
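The spider idea can be sketched as a breadth-first crawl that follows `foaf:knows` links and merges whatever each URI returns. This is a toy under stated assumptions: `FAKE_WEB` stands in for real HTTP dereferencing, and the URIs, the `dereference` and `crawl` helpers, and the `max_docs` limit are all hypothetical; the talk does not describe how the actual agent is written.

```python
from collections import deque

# Stand-in for the web: URI -> triples found when dereferencing it.
FAKE_WEB = {
    "ex:ivan": {("ex:ivan", "foaf:knows", "ex:tom"),
                ("ex:ivan", "foaf:knows", "ex:dick")},
    "ex:tom": {("ex:tom", "foaf:name", "Tom")},
    "ex:dick": {("ex:dick", "foaf:name", "Dick"),
                ("ex:dick", "foaf:knows", "ex:ivan")},
}

def dereference(uri):
    """Pretend to fetch and parse the RDF document behind a URI."""
    return FAKE_WEB.get(uri, set())

def crawl(start, max_docs=10):
    """Follow foaf:knows links breadth-first, merging triples as we go."""
    model, seen, queue = set(), set(), deque([start])
    while queue and len(seen) < max_docs:
        uri = queue.popleft()
        if uri in seen:
            continue  # already fetched; avoids loops such as dick -> ivan
        seen.add(uri)
        triples = dereference(uri)
        model |= triples
        for _, p, o in triples:
            if p == "foaf:knows" and o not in seen:
                queue.append(o)
    return model

model = crawl("ex:ivan")
names = sorted(o for _, p, o in model if p == "foaf:name")
print(names)
```

The `seen` set and the document limit keep the crawl finite even when friends link back to each other.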
  • We can draw two conclusions. Firstly, our work can overcome the limitations of XSLT to process RDF data, regardless of its serialization format. Note that we can address RDF data from XSLT+SPARQL even if it is available in other syntaxes, such as N3. Secondly, these functions give you, the developers of scripts for the semantic web, a new language to write such scripts. And this platform still has the potential to grow, for instance, by introducing functions to perform reasoning and inference.


  • Transcript of "XSLT+SPARQL: Scripting the Semantic Web with SPARQL embedded into XSLT stylesheets"

    1. XSLT+SPARQL: Scripting the Semantic Web with SPARQL embedded into XSLT stylesheets Diego Berrueta, Jose E. Labra and Ivan Herman CTIC Foundation / Universidad de Oviedo / CWI Scripting for the Semantic Web (SFSW’08) Tenerife, 02/Jun/2008
    2. Outline 1. Introduction 2. Description of XSLT+SPARQL 3. Implementation and examples 4. Conclusions
    3. The problem Many applications need to transform RDF into markup (e.g. XHTML) In the XML world, we have XSLT (XML transformations) ... what do we have in the RDF world?
    4. Why I cannot use XSLT? RDF/XML is complex, messy, cumbersome ⇒ non-standard XML serializations are required XPath expressions (and patterns) are designed for trees, not graphs XSLT is designed for a different data model
    5. RDF is not XML, therefore...
    6. RDF is not XML, therefore... I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML
    7. RDF is not XML, therefore... I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML I will not use XSLT to transform RDF/XML
    8. Related work TriX, RDFXSLT: “alternative” RDF syntaxes in XML RDF Twig, TreeHugger: add intelligence to XSLT processors Topia: XSLT functions to query Sesame using pre-SPARQL languages XSPARQL: unification of SPARQL and XQuery in a single language
    9. Scripting RDF transformations Many scripting languages have RDF APIs, but they’re not standard Coding transformations in script languages is messy (e.g.: Vapour)
    10. XSLT+SPARQL Two sets of XPath functions to query RDF models using SPARQL Intended to be used in @select expressions of XSLT stylesheets Return plain, standard XML easily tractable with XSLT
    11. Query results XML syntax <results> <result> <binding name="fname"><value>Tom</value></binding> <binding name="email"><value>tom@example.org</value></binding> </result> <result> <binding name="fname"><value>Dick</value></binding> <binding name="email"><value>dick@example.org</value></binding> </result> </results> W3C Recommendation (Jan 2008)
    12. Basic functions sparql:sparql(query [, documentUrl, ...]) sparql:sparqlEndpoint(query, endpointUrl) Execute SPARQL queries locally or remotely (endpoint) Model can be extended with “FROM” clauses
    13. Advanced functions (I) Retrieving and parsing RDF data is expensive Advanced functions designed for: ‣ Efficiently query the same model multiple times ‣ Create custom models programmatically
    14. Advanced functions (II) sparql’:parseString(string, syntax) Creates a model by parsing a string containing serialized RDF Support for different syntaxes: RDF/XML, N3, TriX...
    15. Advanced functions (III) sparql’:readModel(documentUrl, syntax) sparql’:readModel(nodeset) Create models by parsing: 1. A document retrieved from the web 2. An XML subtree (from the XSLT input document or even from the XSLT itself)
    16. Advanced functions (IV) sparql’:mergeModels(model1, model2, ...) Combines two or more RDF models into a new one
    17. Advanced functions (V) sparql’:sparqlModel(query, model) Executes a SPARQL query against an in-memory model
    18. Two implementations 1. Java-based extension to Apache Xalan using Jena Exploits the extensibility of XSLT 2. Pure XSLT implementation Incomplete, XSLT “document()” cannot perform conneg
    19. Applications of XSLT+SPARQL Transform RDF data for presentation in XHTML, SVG... Generate reports beyond SPARQL capabilities Develop scripts that retrieve and operate with RDF data from the (semantic) web
    20. Example: Ivan’s acquaintances <xsl:template match="/"> <xsl:apply-templates select="sparql:sparql(concat(sparql:commonPrefixes(), 'SELECT ?name ?mbox_sha1sum ?homepage FROM &lt;http://www.ivan-herman.net/foaf.rdf&gt; WHERE { &lt;http://www.ivan-herman.net/Ivan_Herman&gt; foaf:knows ?x . ?x foaf:name ?name . OPTIONAL { ?x foaf:mbox_sha1sum ?mbox_sha1sum . ?x foaf:homepage ?homepage }}'))"/> </xsl:template> <xsl:template match="results:results"> <html> ... </html> </xsl:template>
    21. Advanced examples Querying DBPedia endpoint HTML displays of SKOS thesauri Alphabetic display Systematic display (tree-like) “Spider” agent for LoD
    22. Conclusions and future work XSLT+SPARQL overcomes the limitations of XSLT to process RDF XSLT+SPARQL can be used to write declarative scripts for the semantic web Future extensions: inference support
    23. Thank you for your attention diego.berrueta@fundacionctic.org
