To OO or not to OO? Revelations from defining an ontology for an archaeological information system


A presentation given by Keith May and me at CAA 2004, held in Prato, Italy. The topic was a sub-project that emerged from the English Heritage Revelation project: the Ontological Modelling project. This project looked at a range of existing data models, paper forms, databases and other source information and, through discussions with domain specialists, created a representation of the information archaeologists use, based on the CIDOC Conceptual Reference Model (CRM).

Published in: Technology, Education

  • The Ontological Modelling project is a CfA project that derives from the assessment stage of the Revelation project.
  • The ontological modelling project developed from the original work on the Revelation project at the English Heritage Centre for Archaeology. Revelation carried out a review of existing systems, which painted a picture of fragmentation: different parts of CfA each had their own information systems, and these did not speak to each other except through a series of ad hoc “routes”. Further Revelation work looking at sectoral practice suggested that this picture was by no means unique to CfA, and that there would be value in developing models that could better express the relationships between archaeological data and processes at a conceptual level, in addition to more standard data flow diagrams and entity modelling.
  • We needed to express the existing use of data within CfA in a way that users could understand, at a general level, as representing how they went about their work on archaeological projects and how that work related to others in the different specialist teams at CfA. There was also a requirement to model the current state of affairs, but in a way that would let us show how the data could be better structured in future for sharing and interoperability, not just within the CfA but also for the wider organisation in EH and beyond, to the archaeological sector and the public. An ontology seemed to offer a way to begin, because it can express not just the keywords in the data but also the conceptual meanings behind the information held in the various systems. The principal aims in adopting an ontological approach were:
      • Shared understanding of information: building bridges between different data sets
      • Encapsulating & re-using domain expertise: the ability to share across different specialisms
      • Enabling searching by non-domain experts: using semantic web compatible approaches
  • The CIDOC CRM has evolved from the world of museum documentation. It is only recently becoming known in the archaeological world (as the number of papers at this year's conference seems to testify), but there appeared to be a number of advantages to using the CRM for modelling archaeological recording systems:
      • First, and possibly the biggest selling point, the modelling approach is based on mapping the knowledge of the domain experts. Rather than setting a prescriptive set of standard terms that everyone had to use or else be incompatible, there was considerable appeal to archaeologists in an approach that only asked that existing data be mapped to a more conceptual model for it to be usable.
      • Defining conceptual processes that analyse the data without necessarily following simple data processing techniques (e.g. the ability to model phasing and grouping).
      • Relating archaeological data to environmental, geological and agricultural domains.
      • Event-based modelling of archaeological activities.
      • Extensibility of the CRM, allowing local extensions of the model while maintaining compatibility.
      • Using the CRM for modelling gave the advantages of OO modelling without pre-determining a non-relational implementation.
      • Using an existing ontology such as the CRM should provide greater standardisation and interoperability with similar data sets.
  • At its core the CRM consists of a few high-level concepts. These few concepts have been extended by a larger set of specialisations that allow us to describe the Heritage Domain more fully. Some of the archaeological examples that map to the CRM are given below (expand on these as time allows). The core concepts are:
      • E2 Temporal Entities: all things that happen in time. They include events like the creation of objects, their loss, deposition, discovery, interpretation and conservation.
      • E18 Physical Stuff: persistent physical items with a relatively stable form, both man-made and natural. Examples include pottery, coins, pollen and carbonised seeds.
      • E28 Conceptual Objects: non-material products of our minds. They include things like project designs, reports, the mark used by a potter, text books, songs and military orders.
      • E39 Actors: people, either individually or in groups, who have the potential to perform intentional actions for which they can be held responsible. Individuals include people like potters, archaeologists and scientists. Groups include project teams, archaeological societies, excavation units and English Heritage.
      • E52 Time-Spans: temporal extents that have a beginning, an end and a duration. For example, “The duration of the Catterick project” or “The duration of the use of the X potters mark”.
      • E53 Places: mathematical extents in space. They are usually relative to the surface of the earth but can be relative to some other fixed body of matter (for example, the bow of a ship is a place). For example, “The total extent of the excavation in 1967”.
      • E41 Appellations: all proper names, words, phrases or codes that are used to identify something. For instance, John Smith, J. Smith and Smith, John are all names. They are different from the person. Other examples include context 1456, Lyons, English Heritage and The Portland Vase.
      • E55 Types: the classifications used to characterise something. For example, Samian is a type of pottery. This is where all thesauri, word lists and controlled vocabulary fit into the CRM.
  • Finding a suitable methodology proved less straightforward, as the CRM does not actually include one. The overall approach adopted was derived from general examples of ontology building. However, we were not building the ontology so much as mapping CfA information to it and defining methods for how we would actually use the CRM. The approach was broadly as follows:
      1. Acquire domain knowledge: defined our domain as CfA information systems; interviews with domain experts.
      2. Organize the ontological model. This can be seen as two basic operations: identifying the global concepts (classes) that best match the data being created, and identifying the properties (roles and relationships between the classes).
      3. Flesh out the ontological model: drawing the diagrams; text documentation of entities/classes and relationships.
      4. Check the work: re-iterate discussions and checking of diagrams with domain experts; circulation to domain experts.
      5. Commit the ontological model: final verification by the CRM community; broaden usage as appropriate to the wider archaeological community.
    Partly because we found this area less well documented, we felt that writing this paper would help others looking for methodologies.
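The core CRM concepts described in the slide notes can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the project's model or any official CRM encoding: the `Entity` base class, the `is_identified_by` helper and all attribute names are hypothetical, and only the class codes and example values (John Smith, context 1456, Samian) come from the notes. It shows the point made about E41 Appellations: several names can identify one thing while remaining distinct from it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Appellation:
    """E41 Appellation: a name, word, phrase or code identifying something."""
    value: str


@dataclass(frozen=True)
class Type:
    """E55 Type: a classification term, e.g. from a thesaurus or word list."""
    label: str


@dataclass
class Entity:
    """Hypothetical common base: anything that can be named and classified."""
    appellations: List[Appellation] = field(default_factory=list)
    types: List[Type] = field(default_factory=list)

    def is_identified_by(self, name: str) -> bool:
        # True if any appellation attached to this entity matches the name.
        return any(a.value == name for a in self.appellations)


class TemporalEntity(Entity):
    """E2 Temporal Entity: things that happen in time."""


class PhysicalStuff(Entity):
    """E18 Physical Stuff: persistent physical items."""


class ConceptualObject(Entity):
    """E28 Conceptual Object: non-material products of our minds."""


class Actor(Entity):
    """E39 Actor: people, individually or in groups."""


class Place(Entity):
    """E53 Place: an extent in space."""


# One person, three appellations: the names are not the person.
smith = Actor(appellations=[
    Appellation("John Smith"),
    Appellation("J. Smith"),
    Appellation("Smith, John"),
])

# A context identified by its code, and a sherd classified as Samian.
context = Place(appellations=[Appellation("context 1456")])
sherd = PhysicalStuff(types=[Type("Samian")])
```

Keeping appellations and types as separate objects, rather than plain string fields on each class, mirrors the notes' observation that thesauri, word lists and controlled vocabularies all attach to the model through E55 Types.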

    1. 1. To OO or not to OO? Revelations from defining an ontology for an archaeological information system. English Heritage Centre for Archaeology (CfA). Paul Cripps & Keith May
    2. 3. Why use an ontology?
        • Shared understanding of information
        • Encapsulating & re-using domain expertise
        • Enabling searching by non-domain experts
    3. 4. Advantages of the CRM
        • Mapping the knowledge of the domain experts
        • Defining conceptual archaeological processes
        • Relating archaeology to other domains
        • Event based modelling of archaeological activities
        • existing ontology provides greater standardisation and interoperability
    4. 6. CRM modelling methodology
        • Acquire domain knowledge
        • Organize the ontological model
        • Flesh out the ontological model
        • Check the work
        • Commit the ontological model
    5. 7. So what exactly are we modelling and why…?
        • The Centre for Archaeology information domain
        • This includes archaeologists, geophysicists, scientific specialists, conservators, archivists, surveyors, buildings specialists, finds specialists, graphic artists
        • As the basis for improving our shared use of information, a conceptual framework for systems development
    6. 8. So why an OO approach to the modelling…?
        • Flexible approach using UML to visualise, unlike other approaches
        • Event driven
        • IsA relationships easily understood, similar to traditional hierarchical classificatory approaches common in archaeology
        • Class inheritance
        • CRM uses an OO approach
    7. 9. The model in detail
        • The OO approach focussed our minds on looking for patterns …
        • … and identifying gaps.
        • Use of stereotyping and class inheritance.
        • Also meta-entities.
    8. 10. Patterns
        • Looking in detail for commonality…
        • … especially across teams.
        • Similar functions across teams eg those relating to fieldwork
        • Not easy to identify using other techniques.
    9. 11. Gaps
        • Data and objects exclusively as the product of events
        • Thinking in terms of objects and events facilitates spotting missing objects or events.
        • e.g. A spot date for a context, based on stratigraphy, finds and known typologies, specialist assertion - many events building a web of information.
        [Diagram labels: context; Spot date; Specialist Assertion]
    10. 12. Stereotyping, class inheritance and meta-entities
        • Used during application of the CRM.
        • We can say the process of excavation is an Activity, but also a Creation Event and a Destruction Event, inheriting all properties from each superclass.
        • We can define a group of related entities as a meta-entity; for example, CfA Activities always involve identifiable Actors and have Time-spans.
        [Diagram classes: E7: Activity; E39: Actor; E52: Time-span; E62: String (Timestamp), shown twice]
        [Diagram properties: P14: carried out by (performed); P4: has time-span (is time-span of); P79: beginning is qualified by; P80: end is qualified by]
    11. 13. Events and the archaeological process
        • Events in the past result in remains in the present
        • Activities in the present engage with and investigate the remains of the past
        • Effectively two groups of events, one in the present, one in the past, related by the place in which they occur and the physical remains in that place.
    12. 14. Events in the present
        • Excavation
        • Drawing and photography
        • Survey
        • Sampling
        • Treatments and processing
        • Classification and grouping, including phasing
        • Measuring, including scientific dating
        • Recording of observations
        • Dissemination
        • Interpretation/Analysis can be seen as abstract classes for stereotyping ie any activity can implement analysis/interpretation
    13. 15. Events in the present
        • e.g. The Measure Find activity
        [Diagram classes: <<E16: Measurement Event>> Measure Find; <<E19: Physical Object>> Find; E54: Dimension (measurements: length, width, diameter, weight, etc); E58: Measurement Unit; E60: Number; E39: Actor; E55: Type, shown twice]
        [Diagram properties: P39: measured (was measured by); P40: observed dimension (was observed in); P90: has value; P91: has unit (is unit of); P14: carried out by (performed); P14.1: in the role of; P2: has type (is type of)]
    14. 16. Events in the past
        • Context formation and depositional events (stratigraphy)
        • Geochemical, geological, environmental and biological processes
        • Object production and loss (finds deposition)
        • Construction, modification and destruction events relating to features and structures
        • Events occur at places; spatial operators for reasoning about spatial relationships
        • Allen’s Temporal Operators for reasoning about the sequence of events and building the site matrix
    15. 17. Events in the past
        • eg finds production and deposition
        [Diagram classes: <<E65: Creation Event>> Object production; <<E9: Move>> Finds deposition; <<E19: Physical Object>> Find; <<E53: Place>> Context (the context, a place defined by a volume (deposits, structures) or surface (cuts))]
        [Diagram properties: P94: has created (was created by); P25: moved (moved by); P26: moved to (was destination of)]
    16. 18. Conclusions
        • The event driven, object-oriented model is well-suited to the archaeological process.
        • An ontological base provided by the CRM provides semantic clarity and greater potential for interoperability.
    17. 19. Future Directions
        • Completion of the model
        • Implementation
        • buy some paracetamol...
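Two ideas from the slides lend themselves to a short Python sketch: excavation as an Activity that is also a Creation Event and a Destruction Event, and Allen's temporal operators for reasoning about the sequence of events. The code below is a minimal illustration under stated assumptions, not the CfA model. The `Event` base class, the attribute names, the `destroyed` attribute, the numeric time bounds and all example values are hypothetical; only the property labels P4, P14 and P94 come from the slides. Time-spans are assumed to be proper intervals with `start < end`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeSpan:
    """E52 Time-Span with numeric bounds (units are illustrative, e.g. years)."""
    start: float
    end: float


def allen_relation(a: TimeSpan, b: TimeSpan) -> str:
    """Name the Allen interval relation that holds from a to b."""
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met-by"
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    return "overlaps" if a.start < b.start else "overlapped-by"


class Event:
    """Hypothetical base class: every event has a time-span (P4)."""
    def __init__(self, time_span: TimeSpan, **kwargs):
        super().__init__(**kwargs)
        self.time_span = time_span


class Activity(Event):
    """E7 Activity: carried out by an Actor (P14)."""
    def __init__(self, carried_out_by: str, **kwargs):
        super().__init__(**kwargs)
        self.carried_out_by = carried_out_by


class CreationEvent(Event):
    """Creation Event: records what it has created (P94)."""
    def __init__(self, has_created=(), **kwargs):
        super().__init__(**kwargs)
        self.has_created = list(has_created)


class DestructionEvent(Event):
    """Destruction Event: records what it destroyed (attribute name assumed)."""
    def __init__(self, destroyed=(), **kwargs):
        super().__init__(**kwargs)
        self.destroyed = list(destroyed)


class Excavation(Activity, CreationEvent, DestructionEvent):
    """Excavation inherits the properties of all three superclasses."""


# An event in the present (made-up values).
dig = Excavation(
    carried_out_by="CfA project team",
    time_span=TimeSpan(2003.0, 2004.0),
    has_created=["context record"],
    destroyed=["excavated deposit"],
)

# Two events in the past (made-up dates), related by an Allen operator.
ditch_fill = TimeSpan(100, 150)
pit_cut = TimeSpan(150, 151)
relation = allen_relation(ditch_fill, pit_cut)
```

Each constructor passes its unused keyword arguments on with `super().__init__(**kwargs)`, which is what lets `Excavation` collect properties from all three superclasses in one call. Pairwise relations such as `relation` are the kind of building block a site matrix could be assembled from, although the slides do not say how the project implemented this.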