A RDF Metadata Model for OpenDocument Format 1.2
Svante Schubert, Software Engineer, Sun Microsystems Inc.
About the Speaker
- Working for Sun Microsystems on StarOffice since 1999
- Co-lead of the OpenOffice XML project since 2006
- Responsible for the XML-based filters
- Co-editor of the OASIS Metadata Specification
Agenda
- Metadata basics / existing standards
- Metadata model of ODF 1.2
- Metadata support in OpenOffice.org 3
Metadata Basics
“The solution to the overabundance of information is more information.” (David Weinberger, Everything Is Miscellaneous: The Power of the New Digital Disorder)
Metadata Basics
- What is metadata?
  - “Metadata is data about data” 1)
- Why do I need metadata?
  - Enhanced search
  - Workflow
  - Accessibility
  - Citation
  - Bridge the semantic gap (e.g. zip vs. post code, native languages)
  - ...
1) http://en.wikipedia.org/wiki/Metadata
Metadata Basics: More Precise Metadata Definition
- What is metadata?
  - Labels to identify/categorize your data
  - Related data
- Why do I need metadata?
  - Metadata makes your data interpretable by other applications
Metadata Basics: Extending Current ODF Metadata Support
- Why a new metadata model?
  - Current ODF metadata relates to the document
  - It is not extensible
- Content tagging by styles is not enough!
  - Styles are not descriptive
  - Styles are not interchangeable with other applications
Metadata Basics: The Idea of the Semantic Web
- What is the Semantic Web / Data Web?
  - A web where software can find/combine/share information more easily
- Requirements of the Semantic Web / Data Web
  - Data annotated in a common way using metadata
  - Web applications acting upon standardized metadata
Existing Metadata Standards: Resource Description Framework (RDF)
- RDF/XML is a W3C Recommendation (2004)
- Resources
  - Uniquely identified by an IRI
  - Described by RDF statements
- RDF statements
  - Triple: subject + predicate + object
  - uri://sun/employees/svante foaf:name "Svante Schubert" .
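The triple on this slide can be sketched in a few lines of Python. This is only an illustration of the subject/predicate/object idea, not a real RDF library: the helper names `expand` and `objects` and the in-memory `GRAPH` set are hypothetical, and the `foaf` namespace IRI is taken from the in-content metadata slide.

```python
# Minimal sketch: an RDF graph as a set of (subject, predicate, object) tuples.
# PREFIXES, expand() and objects() are illustrative names, not a library API.

# The foaf namespace IRI is the one that appears on the in-content metadata slide.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def expand(term):
    """Expand a prefixed name such as foaf:name; leave other terms unchanged."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term


# The single statement from the slide, with the predicate expanded to a full IRI.
GRAPH = {
    ("uri://sun/employees/svante", expand("foaf:name"), "Svante Schubert"),
}


def objects(graph, subject, predicate):
    """Return all objects of statements matching the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


NAMES = objects(GRAPH, "uri://sun/employees/svante", expand("foaf:name"))
```

Because every statement has the same three-part shape, graphs from different sources can be merged by taking the union of their triple sets, which is what the following "RDF Graphs... superimpose" slides illustrate.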
Existing Metadata Standards RDF Graphs... Based on http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide6-0.html
Existing Metadata Standards ...superimpose Based on http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide7-0.html
Existing Metadata Standards: Web Ontology Language (OWL)
- Ontology (from Greek)
  - Onto (being)
  - Logia (written/spoken discourse)
- Description of entities/concepts and their relations
- Like an OO language, using classes, properties, etc.
- ODF 1.2 includes an OWL package description
  - OWL classes pkg:Package, pkg:File, odf:Element
Metadata Model of ODF 1.2: Metadata Files in the Package
- Content files, e.g. “/content.xml”, “/styles.xml”
- Metadata manifest “/manifest.rdf”
- RDF files, e.g. “/meta/data.rdf”, “/meta/cita.rdf”
Metadata Model of ODF 1.2: Metadata Files in the Package
- Content files (e.g. content.xml, styles.xml)
  - About 50 ODF elements with an xml:id attribute
- Metadata manifest (manifest.rdf)
  - Heart of the metadata model
  - Mapping from the content's xml:id to RDF IRIs
- User RDF/XML files
  - Metadata file, possibly from an office extension
Metadata Model of ODF 1.2: Metadata Files in the Package
Content files, e.g. “/content.xml”:
  <table:table xml:id="someID"> ... Hospital Doctor Duty List ... </table:table>
Metadata manifest “/manifest.rdf”:
  <odf:ContentFile pkg:path="content.xml">
    <pkg:hasPart>
      <odf:Element rdf:about="uri:someIRI" pkg:idref="someID"/>
RDF files, e.g. “/meta/data.rdf”:
  <odf:Element rdf:about="uri:someIRI">
    <ex:workingHoursOf>
      <med:Doctors rdf:about="http://hospital-DB/doctors/ID116">
        <med:fieldName xml:lang="en">Neurologist</med:fieldN.
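The lookup that the manifest enables (from an xml:id in a content file to the IRI used in the RDF files) might be sketched as follows. This is a hedged illustration based only on the fragment shown on the slide: the `pkg` and `odf` namespace IRIs are assumptions, the closing tags are added so the fragment parses, and `id_to_iri` is a hypothetical helper rather than an OpenOffice.org API.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: namespace IRIs for the pkg: and odf: prefixes; the slides do not state them.
PKG = "http://docs.oasis-open.org/ns/office/1.2/meta/pkg#"
ODF = "http://docs.oasis-open.org/ns/office/1.2/meta/odf#"

# The manifest fragment from the slide, completed with closing tags so it is well-formed.
MANIFEST = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:pkg="{PKG}" xmlns:odf="{ODF}">
  <odf:ContentFile pkg:path="content.xml">
    <pkg:hasPart>
      <odf:Element rdf:about="uri:someIRI" pkg:idref="someID"/>
    </pkg:hasPart>
  </odf:ContentFile>
</rdf:RDF>"""


def id_to_iri(manifest_xml):
    """Map (content file path, xml:id) pairs to the IRI declared in the manifest."""
    root = ET.fromstring(manifest_xml)
    mapping = {}
    for content_file in root.iter(f"{{{ODF}}}ContentFile"):
        path = content_file.get(f"{{{PKG}}}path")
        for element in content_file.iter(f"{{{ODF}}}Element"):
            idref = element.get(f"{{{PKG}}}idref")
            mapping[(path, idref)] = element.get(f"{{{RDF}}}about")
    return mapping


MAPPING = id_to_iri(MANIFEST)
TABLE_IRI = MAPPING[("content.xml", "someID")]
```

With the IRI in hand, an application can search the user RDF files for statements whose subject is that IRI, for example the `ex:workingHoursOf` statement shown for the duty list table.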
Metadata Model of ODF 1.2: In Content Metadata
- Used if the metadata is equal to the text/visual data
- Reason: no data duplication
- Used by five ODF elements:
  - Bookmark start - <text:bookmark-start>
  - Heading - <text:h>
  - Metadata text - <text:meta>
  - Paragraph - <text:p>
  - Table cell - <table:table-cell>
Metadata Model of ODF 1.2: In Content Metadata
Content files, e.g. “/content.xml”:
  <text:p>The doctor's name was <text:meta m:about="http://hospital-DB/doctors/ID116" m:property="http://xmlns.com/foaf/0.1/name">Dr. J. Franklin</text:meta>
RDF files, e.g. “/meta/data.rdf”:
  <med:Doctor rdf:about="http://hospital-DB/doctors/ID116">
    <med:hasPatient>
      <med:Patient rdf:about="http://hospital-DB/patients/IDA1">
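How an in-content annotation turns into an RDF statement can be sketched as below: the subject comes from `m:about`, the predicate from `m:property`, and the object is the element's own text, so nothing is duplicated. The namespace bound to the `m:` prefix is a placeholder (the slide does not give it), the closing `</text:p>` is added so the fragment parses, and `in_content_triples` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
# Assumption: placeholder namespace for the m: prefix used on the slide.
M = "urn:example:in-content-metadata"

# The paragraph from the slide, closed so that it is well-formed XML.
PARAGRAPH = f"""<text:p xmlns:text="{TEXT}" xmlns:m="{M}">The doctor's name was <text:meta m:about="http://hospital-DB/doctors/ID116" m:property="http://xmlns.com/foaf/0.1/name">Dr. J. Franklin</text:meta></text:p>"""


def in_content_triples(content_xml):
    """Collect (subject, predicate, literal) triples from annotated elements."""
    root = ET.fromstring(content_xml)
    triples = []
    for element in root.iter():
        about = element.get(f"{{{M}}}about")
        prop = element.get(f"{{{M}}}property")
        if about and prop:
            # The object is the visible text of the annotated element itself.
            literal = "".join(element.itertext()).strip()
            triples.append((about, prop, literal))
    return triples


TRIPLES = in_content_triples(PARAGRAPH)
```

The resulting statement shares its subject IRI with the `med:Doctor` resource in the RDF file, so the name held in the document text and the patient data held in “/meta/data.rdf” describe the same resource.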
Metadata Model of ODF 1.2: Metadata Text Field
- ODF field “text:meta-field”, based on metadata
- Appears within a paragraph
- Holds any paragraph content
- Citation example: “According to [2]”
  <text:p>According to <text:meta-field xml:id="someID"> <text:style text:style-name="s1">[2]
Existing Metadata Standards Semantic Web Architecture Based on http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html
Existing Metadata Standards ODF in the Semantic Web Based on http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html
Metadata support in OOo 3
- Support of metadata in the ODF 1.2 package
- API for metadata extension developers
  - Providing UNO APIs to access metadata
  - Wrapping existing open source tools
  - Some possible choices: Jena, Sesame, librdf, RDF Twig
- Possibility of a generic metadata extension
  - Import / create your own RDF vocabulary
  - Relate vocabulary to ODF content by GUI
More Information
- Contact me here at OOoCon 2007
- Download latest docs: http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-metadata
- RDF N3 Tutorial: http://www.w3.org/2000/10/swap/Primer
- RDF Converter: http://www.mindswap.org/2002/rdfconvert/
- http://blogs.sun.com/GullFOSS
Thank you –  Questions & Answers Svante Schubert [email_address]

