CAA 2010




A Framework for transforming archaeological
     databases to ontological datasets

                              Monika Solanki
                                 m.solanki@mcs.le.ac.uk

                   Department of Computer Science
                            Joint work with
                               Yi Hong
                    Department of Computer Science
               Lin Foxhall, Alessandro Quercia
           School of Archaeology and Ancient History
                      University of Leicester, UK




From RDBMSs to ontological datasets






Tracing Networks

     Archaeologists study a wide range of material objects.
     By tracking them at every stage of their production,
     distribution, use, and consumption across a large
     geographical region, over a long time period, they can
     trace the links between the people who made, used, and
     taught others to make them.





  Pertinent Questions
     How have individuals or groups of individuals learnt how to
     organise themselves?
     Why did some prosper while others collapsed?
     What are the dynamics of power, influence and the
     exchange of knowledge?
     In what kinds of contexts does innovation appear?


Tracing Networks




                  www.tracingnetworks.ac.uk



Tracing Networks: The Semantic Web perspective

  Build the links through their datasets




                  www.tracingnetworks.ac.uk


Tracing Networks: Loomweights

  Example dataset for this talk: Loomweight dataset






Motivation

  Conventional mapping frameworks
      provide scripting languages to facilitate the mapping.
      apply simplistic mapping rules.





  Realistic scenarios:
      The association between columns and properties is far
      more complex than a simple one-to-one correspondence.
      Domain-specific schemas to be used for mapping have
      been extended from standard vocabularies or those used
      elsewhere.
      Loomweights: the ontological instances conform to a
      domain-specific schema, e.g., CIDOC-CRM.
      Several ontology schemas are used and the data needs to
      be suitably mapped to more than one property.






The Loomweights Dataset

     The column diameter in the RDB table cannot be mapped
     as a datatype property.
     To specify a relationship between diameter and the
     concept Loomweight, create intermediate instances of
     CIDOC-CRM concepts.
     These instances must be contextually related to each
     other so that each loomweight is assigned its correct
     diameter value.
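
     The intermediate-instance idea can be sketched roughly as follows.
     This is a minimal illustration, not the mapping used in the project:
     it assumes the diameter is modelled as a CIDOC-CRM E54 Dimension
     linked through P43, P90 and P91, and the base URI, the centimetre
     unit and the class names Triple and DiameterMapping are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** One RDF statement; each field is already written in N-Triples term syntax. */
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    @Override
    public String toString() {
        return subject + " " + predicate + " " + object + " .";
    }
}

final class DiameterMapping {
    static final String CRM = "http://www.cidoc-crm.org/cidoc-crm/";
    static final String RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>";
    static final String XSD_DOUBLE = "<http://www.w3.org/2001/XMLSchema#double>";
    // Hypothetical base URI for generated instances (an assumption).
    static final String DATA = "http://example.org/loomweights/";

    static String uri(String value) {
        return "<" + value + ">";
    }

    /**
     * Maps one diameter cell to four statements: the loomweight points to an
     * intermediate dimension instance, which carries the type, value and unit.
     */
    static List<Triple> mapDiameter(int loomweightId, double diameterCm) {
        String loomweight = uri(DATA + "loomweight/" + loomweightId);
        // The dimension URI is derived from the loomweight id, so each value
        // stays attached to the loomweight it was measured on.
        String dimension = uri(DATA + "loomweight/" + loomweightId + "/diameter");

        List<Triple> triples = new ArrayList<>();
        triples.add(new Triple(loomweight, uri(CRM + "P43_has_dimension"), dimension));
        triples.add(new Triple(dimension, RDF_TYPE, uri(CRM + "E54_Dimension")));
        triples.add(new Triple(dimension, uri(CRM + "P90_has_value"),
                "\"" + diameterCm + "\"^^" + XSD_DOUBLE));
        triples.add(new Triple(dimension, uri(CRM + "P91_has_unit"), uri(DATA + "unit/cm")));
        return triples;
    }
}
```

     Calling DiameterMapping.mapDiameter(12, 7.5) yields four statements
     instead of the single one a plain datatype property would give.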





Transformation Framework


     ORM Reverse Engineering.
     ECA Rule-based Transformation.
     Ontology Instance Generation.






ORM Reverse Engineering

    ORM reverse engineering extracts the existing tables,
    columns, relationships (including primary keys, foreign
    keys and joins) and indexes from an RDBMS into
    object-oriented data structures, or “classes”.
    Records in a table can then be instantiated as data
    objects, which are easily manipulated and processed
    using OOP techniques.
    In our proposed approach, the Hibernate ORM Reverse
    Engineering tool is used to convert database records into
    Java objects.
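
    As a rough illustration of what this stage produces, the sketch
    below shows a hand-written stand-in for a generated class and a
    helper that turns raw rows into data objects. It does not use
    Hibernate; the table layout (id, shape, diameter) and the names
    LoomweightRecord and RowMapper are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a class a reverse-engineering tool might derive from a loomweight table. */
final class LoomweightRecord {
    private final int id;          // assumed primary key column
    private final String shape;    // assumed text column
    private final double diameter; // assumed numeric column

    LoomweightRecord(int id, String shape, double diameter) {
        this.id = id;
        this.shape = shape;
        this.diameter = diameter;
    }

    int getId() { return id; }
    String getShape() { return shape; }
    double getDiameter() { return diameter; }
}

final class RowMapper {
    /** Instantiates one data object per row; each row is {id, shape, diameter}. */
    static List<LoomweightRecord> fromRows(List<Object[]> rows) {
        List<LoomweightRecord> records = new ArrayList<>();
        for (Object[] row : rows) {
            records.add(new LoomweightRecord(
                    ((Number) row[0]).intValue(),
                    (String) row[1],
                    ((Number) row[2]).doubleValue()));
        }
        return records;
    }
}
```

    Once records are plain objects like these, later stages can read
    them through ordinary getters rather than SQL.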

     ORM Reverse Engineering → ECA Rule-based Transformation → Ontology Instance Generation






ECA Rule-based Transformation


    DOTL (Database Ontology Transformation Language) is a
    textual transformation language based on ECA
    (Event-Condition-Action) rules.
    The fundamental construct of a DOTL transformation rule
    has the form:
                On Event if Condition Do Action



ECA Rule-based Transformation: DOTL


  A basic DOTL rule consists of three parts:
      The event part specifies the triggers of the transformation
      rule.
      The condition part is a logical expression that checks
      the pre-condition of the action to be carried out. The
      default condition is “if undefined”.
      The action part usually consists of a series of operations
      that create new ontology instances and properties, together
      with other corresponding operations.
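
      The three parts of a rule can be mimicked in plain Java as below.
      This is not DOTL syntax and not the generated code, only a sketch
      of the On Event if Condition Do Action shape; the class name
      EcaRule and the event name used in the usage note are hypothetical.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;

/** A rule that runs its action when a matching event arrives and the condition holds. */
final class EcaRule<T> {
    private final String event;            // the trigger the rule listens for
    private final Predicate<T> condition;  // pre-condition checked before acting
    private final Consumer<T> action;      // work to do, e.g. create an instance

    EcaRule(String event, Predicate<T> condition, Consumer<T> action) {
        this.event = event;
        this.condition = condition;
        this.action = action;
    }

    /** Returns true if the action ran, false if the event or condition did not match. */
    boolean handle(String firedEvent, T payload) {
        if (!event.equals(firedEvent) || !condition.test(payload)) {
            return false;
        }
        action.accept(payload);
        return true;
    }
}
```

      For instance, with a Set<String> named created, the rule
      new EcaRule<String>("recordLoaded", id -> !created.contains(id),
      id -> created.add(id)) fires once per identifier. Reading the
      default “if undefined” condition as “the target instance does not
      exist yet” is an assumption of this example.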



DOTL rules: Loomweights Database






Ontology Instance Generation

     The Transformer Builder component builds Java source
     code from the pre-defined DOTL rules.
     Finally, the Transformer Engine component compiles and
     executes the Java code to generate RDF/OWL instances.
     The framework exports all data to an RDF store or as
     persistent RDF files.
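
     To make the export step concrete, the sketch below serialises
     statements as N-Triples text using only the Java standard library.
     The prototype itself relies on the Protégé-OWL API; the plain-string
     approach and the class name NTriplesWriter are assumptions chosen
     so that the example stays self-contained.

```java
import java.util.List;

final class NTriplesWriter {
    /** Escapes the characters that N-Triples requires to be escaped in literals. */
    static String escape(String text) {
        return text.replace("\\", "\\\\")   // backslash first, so later escapes survive
                   .replace("\"", "\\\"")
                   .replace("\n", "\\n")
                   .replace("\r", "\\r");
    }

    /** Wraps text as a plain string literal term. */
    static String literal(String text) {
        return "\"" + escape(text) + "\"";
    }

    /** Wraps an absolute IRI as an N-Triples IRI term. */
    static String uri(String value) {
        return "<" + value + ">";
    }

    /** Renders one statement; all three arguments must already be terms. */
    static String statement(String subject, String predicate, String object) {
        return subject + " " + predicate + " " + object + " .";
    }

    /** Renders each {subject, predicate, object} array as one line. */
    static String serialize(List<String[]> triples) {
        StringBuilder out = new StringBuilder();
        for (String[] t : triples) {
            out.append(statement(t[0], t[1], t[2])).append('\n');
        }
        return out.toString();
    }
}
```

     The returned string can be written to a file to obtain a
     persistent RDF file, or passed to a store's bulk loader.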



Prototype

     Java implementation.
     The open source Hibernate Reverse Engineering
     framework for object/relational mapping.
     DOTL Editor plugin for Eclipse: contains an integrated Java
     code generator implemented in Xpand.
     A formalised EBNF grammar of DOTL, defined in Xtext.
     The metamodel of the language is described using EMF
     (Eclipse Modeling Framework).
     Protégé-OWL API for the generation of RDF/OWL
     instances.






Implementation layers






Closely related work



     D2RQ, Virtuoso: a declarative language to describe
     mappings between relational database schemata and
     OWL/RDFS ontologies.
     R2O: XML-based language for the transformation.
     STAR
     TRANSLATION






Summary

    A transformation framework for migrating large volumes of
    archaeological data stored in RDBs to ontology-based
    datasets on the Semantic Web.
    The ECA-based scripting language DOTL, which allows
    the specification of complex transformation rules from data
    objects to ontologies.
    A motivating example of the loomweights datasets based
    on the CIDOC-CRM ontology schema as a case study.
    A prototype implementation that illustrates our
    methodology.






Future work

     Refine the grammar and semantics of DOTL to enhance its
     expressiveness and improve the usability of the system.
     Implement a user-friendly graphical modeling environment
     for the language in GMF (Graphical Modeling Framework)
     to allow easy creation and editing of transformation rules.
     Expose the datasets as LOD (Linked Open Data).
     Apply semantic search and reasoning techniques to the LOD
     as methodologies to “trace” the links between the artifacts.






The Tracing Networks LOD cloud








                            Thanks!!!




paper at:
http://www.tracingnetworks.ac.uk/publications/CAA2010/paper.pdf
slides at:
http://www.tracingnetworks.ac.uk/publications/CAA2010/slides.pdf


