Krextor – An Extensible XML→RDF Extraction Framework

Workshop Scripting for the Semantic Web, ESWC 2009


  1. Krextor – An Extensible XML→RDF Extraction Framework
     Scripting for the Semantic Web, 5th Workshop
     Christoph Lange, Jacobs University, Bremen, Germany
     KWARC – Knowledge Adaptation and Reasoning for Content
     May 31, 2009
     Outline: Sem. Markup and RDF · Krextor Framework · Applications · Examples · Related · Conclusion
  2. Overview
     Want XML applications to contribute to the Semantic Web?
       1. Define a schema→ontology mapping for your XML language
       2. Extract RDF from XML
     Krextor:
       • Specify XML→ontology mappings (as extraction rules)
       • Perform extraction (XSLT-based implementation)
     http://kwarc.info/projects/krextor/
  3. XML vs. RDF
     Two slices of the infamous Layer Cake: RDF and XML.
     The cake doesn’t tell much about the role of XML:
       1. XML only for encoding higher-layer formalisms like RDF or OWL?
       2. or XML as a metalanguage in its own right?
     In case (2), we need a semantics for XML-based languages!
  4. XML languages
     Advantages of using XML for knowledge representation (and not just RDF):
       1. Sequential order out of the box
       2. Style languages (CSS, XSL)
     Given any domain, one can define an XML schema for a domain-specific language:
       • concise syntax for domain experts
       • no need to think in triples (compare OWL XML vs. RDF/XML)
  5. What about the semantics?
       <workshop xml:id="SFSW09" conference="#ESWC09" number="5" date="2009-05-31">
         <title short="SFSW">Scripting for the Semantic Web</title>
       </workshop>
     Usual approach: human-readable specification, then hard-code
     Semantic approaches: RDFa, Microformats
     Open questions:
       1. How to give the above language a direct RDF-based semantics?
       2. How to implement the XML→RDF translation?
  6. Making an XML language semantic
     We are focused on practical implementation, not on a formal semantics bridging XML and RDF.
     We want to benefit from existing XML and RDF tools.
     Our approach:
       1. provide rules that translate XML to RDF
       2. if needed, supply an ontology as vocabulary for the extracted RDF
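The first step of this approach, rules that translate XML to RDF, can be illustrated with a minimal Python sketch applied to the `<workshop>` element from the previous slide. It is not Krextor's XSLT implementation; the `BASE` and `VOCAB` URIs and the property names (`partOf`, `shortTitle`, and so on) are hypothetical stand-ins for whatever ontology the language designer supplies.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/workshops"   # hypothetical document URI
VOCAB = "http://example.org/vocab#"     # hypothetical ontology namespace

WORKSHOP = """<workshop xml:id="SFSW09" conference="#ESWC09" number="5" date="2009-05-31">
  <title short="SFSW">Scripting for the Semantic Web</title>
</workshop>"""


def extract_workshop(xml_text):
    """Translate one <workshop> element into (subject, predicate, object) triples."""
    ws = ET.fromstring(xml_text)
    # Rule: the element becomes a resource identified by its xml:id.
    subject = f"{BASE}#{ws.get(XML_ID)}"
    triples = [(subject, RDF_TYPE, VOCAB + "Workshop")]
    # Rule: the conference attribute is a fragment reference to another resource.
    triples.append((subject, VOCAB + "partOf", BASE + ws.get("conference")))
    # Rule: number and date become literal-valued properties.
    for attr in ("number", "date"):
        triples.append((subject, VOCAB + attr, ws.get(attr)))
    # Rule: the title child contributes a full and a short title.
    title = ws.find("title")
    triples.append((subject, VOCAB + "title", title.text))
    triples.append((subject, VOCAB + "shortTitle", title.get("short")))
    return triples


for triple in extract_workshop(WORKSHOP):
    print(triple)
```

Each rule is hard-coded here for readability; a framework would factor the rules out so that they can be declared per XML language.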
  7. Krextor’s History
       1. Origin: managing OMDoc (Open Mathematical Documents; XML schema and ontology) in a semantic wiki
       2. Hard-coded Java implementation: too inflexible to maintain
       3. More lightweight approach: XSLT coded from scratch (OMDoc→RXR→Java)
       4. Needed support for other languages
       5. Created Krextor, a generic XSLT-based framework
       6. . . . and provided some more translations (“extraction modules”)
     http://kwarc.info/projects/krextor/
  8. The Framework
     [Diagram: input formats → generic representation → output formats]
       Input formats: OMDoc (+RDFa), OMDoc/OWL (+RDFa), XHTML (+RDFa), OpenMath, my XML (+RDFa?), my Microformat, ?
       Output formats: RDF/XML, RXR, Turtle, Java callback, your format?
     Collection of XSLT stylesheets, Java wrapper, Shell frontend
     Output targeted at machines, not humans
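The idea of one generic representation feeding several interchangeable output formats can be sketched as follows. This is a Python illustration under assumptions, not the framework's XSLT: the `Literal` class, the function names, and the `OUTPUT_MODULES` registry are hypothetical, and the Turtle serializer only writes full URIs, one triple per line.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple, Union


@dataclass(frozen=True)
class Literal:
    """A literal object value with an optional datatype URI."""
    value: str
    datatype: Optional[str] = None


# Generic representation: the object is either a URI string or a Literal.
Triple = Tuple[str, str, Union[str, Literal]]


def turtle_output(triples: Iterable[Triple]) -> str:
    """Serialize triples as simple Turtle, one statement per line."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            # Escape only backslashes, quotes and newlines in this sketch.
            escaped = (o.value.replace("\\", "\\\\")
                              .replace('"', '\\"')
                              .replace("\n", "\\n"))
            obj = f'"{escaped}"' + (f"^^<{o.datatype}>" if o.datatype else "")
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def callback_output(triples: Iterable[Triple],
                    on_triple: Callable[[str, str, Union[str, Literal]], None]) -> None:
    """Hand each triple to application code instead of serializing it."""
    for s, p, o in triples:
        on_triple(s, p, o)


# Hypothetical registry: adding an output format means adding one entry.
OUTPUT_MODULES = {"turtle": turtle_output}
```

Because every input module produces the same triple stream, a new serialization only has to be written once to become available for all input languages.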
  9. Adding Input and Output Modules
     Input module (for a new XML language):
       • very simple declarative mappings (element class)
       • otherwise pattern-match XML structure, then call a predefined template: create resource, add property, etc.
       • several ways of generating URIs for XML elements: xml:id, auto-generated, custom
     Output module (for a new RDF serialization):
       • implement low-level “triple generation template”
       • or post-process output of an existing module
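The three ways of generating URIs named above (xml:id, auto-generated, custom) can be pictured as a chain of strategies tried in order. The following Python sketch is only an illustration of that idea; the function names, the `href`-based custom strategy, and the `tag-n` naming scheme for generated URIs are assumptions, not Krextor's actual behaviour.

```python
import xml.etree.ElementTree as ET
from itertools import count

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def uri_from_xml_id(elem, base):
    """Use the element's xml:id as a fragment of the document URI, if present."""
    ident = elem.get(XML_ID)
    return f"{base}#{ident}" if ident is not None else None


def custom_href(elem, base):
    """Hypothetical custom strategy: reuse an href attribute as the URI."""
    return elem.get("href")


def make_auto_uri():
    """Return a strategy that always succeeds by numbering elements."""
    counter = count(1)

    def auto_uri(elem, base):
        return f"{base}#{elem.tag}-{next(counter)}"

    return auto_uri


def resource_uri(elem, base, strategies):
    """Return the URI from the first strategy that yields one."""
    for strategy in strategies:
        uri = strategy(elem, base)
        if uri:
            return uri
    raise ValueError(f"no URI could be generated for <{elem.tag}>")


doc = ET.fromstring(
    '<doc><item xml:id="a"/><item/><item href="http://example.org/x"/></doc>')
strategies = [custom_href, uri_from_xml_id, make_auto_uri()]
uris = [resource_uri(e, "http://example.org/doc", strategies) for e in doc]
print(uris)
```

Putting the auto-generating strategy last makes it a fallback, so every element that should become a resource receives some URI.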
  10. Our own applications
      Semantic wiki:
        • SWiM semantic wiki (http://swim.kwarc.info)
        • mathematical documents (OMDoc, OpenMath)
        • extract RDF outline from documents
        • use it for navigation, querying, problem-solving assistance
      Documented ontologies:
        • write ontologies in OMDoc (better documentability → poster session)
        • Krextor translates to OWL
  11. Example: hCalendar Microformat (1)
      Input:
        <div class="vevent">
          <a class="url" href="http://www.eswc2009.org">ESWC</a>
          starts on <span class="dtstart">2009-05-31</span>.</div>
      Desired output:
        <http://www.eswc2009.org>
          a <http://www.w3.org/2002/12/cal/ical#Vevent> ;
          <http://www.w3.org/2002/12/cal/ical#dtstart> "2009-05-31"^^<http://www.w3.org/2001/XMLSchema#date> .
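To make the mapping from this input to the desired output concrete, here is a minimal Python sketch that handles exactly the two hCalendar properties used on the slide (`url` and `dtstart`). It stands in for the XSLT extraction module, whose code is not shown in this transcript; the function names are hypothetical, and the input is assumed to be well-formed XML without a namespace, as in the snippet.

```python
import xml.etree.ElementTree as ET

ICAL = "http://www.w3.org/2002/12/cal/ical#"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

INPUT = """<div class="vevent">
  <a class="url" href="http://www.eswc2009.org">ESWC</a>
  starts on <span class="dtstart">2009-05-31</span>.</div>"""


def has_class(elem, name):
    """True if the space-separated class attribute contains the given name."""
    return name in elem.get("class", "").split()


def hcalendar_to_turtle(xhtml):
    """Emit one Turtle statement block per vevent that carries a url."""
    root = ET.fromstring(xhtml)
    blocks = []
    for event in root.iter():          # iter() includes the root element itself
        if not has_class(event, "vevent"):
            continue
        url = next((e.get("href") for e in event.iter()
                    if has_class(e, "url") and e.get("href")), None)
        if url is None:
            continue                   # this sketch skips events without a URL
        statements = [f"a <{ICAL}Vevent>"]
        for e in event.iter():
            if has_class(e, "dtstart") and e.text:
                statements.append(
                    f'<{ICAL}dtstart> "{e.text.strip()}"^^<{XSD_DATE}>')
        blocks.append(f"<{url}> " + " ;\n    ".join(statements) + " .")
    return "\n".join(blocks)


print(hcalendar_to_turtle(INPUT))
```

The event's `url` serves as the subject URI, which corresponds to the "custom" URI generation option for input modules.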
  12. Example: hCalendar Microformat (2)
      Usage:
        krextor hcalendar..turtle infile.xhtml
  13. Example: Declarative Mapping (OpenMath)
      Resources:
        <xsl:variable name="krextor:resources">
          <CD type="&omo;ContentDictionary"/>
          <CDDefinition type="&omo;SymbolDefinition"
                        related-via-properties="&omo;containsSymbolDefinition"/>
          <Example type="&omo;Example" related-via-properties="&omo;hasExample"/>
        </xsl:variable>
        <xsl:template match="CD|CDDefinition|Example">
          <xsl:apply-templates select="." mode="krextor:create-resource"/>
        </xsl:template>
      Properties:
        <xsl:variable name="krextor:literal-properties">
          <Name property="&dc;identifier" normalize-space="true"/>
          <Description property="&dc;description" normalize-space="true"/>
          <Title property="&dc;title" normalize-space="true"/>
          <Role property="&omo;role" normalize-space="true"/>
        </xsl:variable>
        <xsl:template match="Name|Description|Title|Role">
          <xsl:apply-templates select="." mode="krextor:add-literal-property"/>
        </xsl:template>
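The effect of such a declarative mapping can be approximated in Python by two lookup tables and a generic tree walk. This sketch makes several assumptions that the slide does not confirm: the `OMO` namespace URI is a placeholder, elements are matched without an XML namespace, resource URIs are auto-numbered, and `related-via-properties` is read as "the enclosing resource points to this one via the given property".

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC = "http://purl.org/dc/elements/1.1/"
OMO = "http://example.org/omo#"   # hypothetical stand-in for the &omo; entity

# element name -> (RDF type, property linking the enclosing resource to it)
RESOURCES = {
    "CD": (OMO + "ContentDictionary", None),
    "CDDefinition": (OMO + "SymbolDefinition", OMO + "containsSymbolDefinition"),
    "Example": (OMO + "Example", OMO + "hasExample"),
}

# element name -> literal property of the enclosing resource
LITERAL_PROPERTIES = {
    "Name": DC + "identifier",
    "Description": DC + "description",
    "Title": DC + "title",
    "Role": OMO + "role",
}


def extract(root, base):
    """Walk the tree once and apply whichever table entry matches each element."""
    triples = []
    counter = 0

    def visit(elem, parent):
        nonlocal counter
        current = parent
        if elem.tag in RESOURCES:
            rdf_type, link = RESOURCES[elem.tag]
            counter += 1
            current = f"{base}#{elem.tag}-{counter}"   # auto-generated URI
            triples.append((current, RDF_TYPE, rdf_type))
            if link and parent:
                triples.append((parent, link, current))
        elif elem.tag in LITERAL_PROPERTIES and parent:
            # Equivalent of normalize-space="true": collapse whitespace runs.
            text = " ".join("".join(elem.itertext()).split())
            triples.append((parent, LITERAL_PROPERTIES[elem.tag], text))
        for child in elem:
            visit(child, current)

    visit(root, None)
    return triples
```

Supporting another element then means adding one table entry rather than writing a new traversal, which is the point of the declarative style.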
  14. Related Work
      • Swignition: extensive support for “standard” semantics (RDFa, microformats, GRDDL), but harder to add a new input language
      • XSDL: declarative XML→OWL-DL mapping. Not (?) implemented; would make a nice frontend to Krextor
      • XSPARQL: combines SPARQL and XQuery, breaks boundaries between XML and RDF. Currently rather one-time queries than complete translations.
  15. Conclusion
      • Krextor supports many XML→RDF conversion tasks
      • Easy to extend, easy to integrate into applications
      Possible integration into engineering workflows:
        • Ontology engineering: First design the ontology, then a convenient XML syntax for domain-specific knowledge
        • Language engineering: Specify the semantics while engineering the schema
