Tdwg Ontology 03.Key
by rogerhyam on May 13, 2009
A discussion of the TDWG Ontology development process and what is planned for 2009
I hope also that this brief talk is in line with what you want to discuss and that it at least gives some background.
By 2004 there had been a change in focus to what could be called data exchange standards. These mainly took the form of XML document structures specified in XML Schema.
Because of the nature of biodiversity data (particularly taxonomic data) there was an overlap between these schemas. Taxon names and specimens, for example, were defined in both ABCD and DarwinCore. It proved very difficult, if not actually impossible, to incorporate “concepts” from one schema into another at both a syntactic and a semantic level.
In 2005 I was given a 2.5-year contract as TDWG Standards Architect. Part of my role was to try to bring some integration across TDWG.
I'm going to talk through this diagram for a few minutes as background.
The blue areas on the diagram are XML Schema based exchange standards.
The green areas are Semantic Web based.
The yellow areas are specific to the TDWG TAPIR protocol and represent a mapping (using what are called TAPIR output models) between XML documents and XML serialised RDF in the green areas.
This diagram represents an attempt to “square the circle” and adopt semantic web technologies without letting go of XML Schema based exchange standards.
There were good reasons for taking this approach.
●Firstly there was considerable reluctance in the community to leave XML Schemas behind.
●Data sharing networks using the XML schemas (DiGIR and BioCASe, and their replacement TAPIR) were already in place.
On reflection this is a bad approach because it is very complicated. It involves mapping XML document structures to XML serialisations of RDF. Specifying an XML document in XSD that is also valid RDF/XML is very complex.
I am increasingly of the opinion that semantic technologies are the best way forward and that this kind of binding to XML Schema is a mistake.
●TDWG Ontology
●LSID Vocabularies
From the start it was realised that we needed some form of ontology. Developing such an ontology is time consuming and complex.
At the same time as the ontology was being developed TDWG was actively promoting LSIDs as the preferred Globally Unique Identifier technology.
The default metadata return type for LSIDs is RDF. We therefore needed some form of vocabulary for the RDF elements returned.
Top down modeling of the ontology was not going to provide the application level classes quickly enough.
The solution to this problem was to split the modeling effort into two parts. The LSID Vocabulary was the application level stuff. The TDWG Ontology was the top down stuff.
Some of the LSID classes are in use with data providers such as IPNI and Index Fungorum. I am not aware of the top-down, higher-level classes being in use.
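To make the need for a vocabulary concrete, here is a minimal sketch of what a client has to do with the RDF metadata returned for an LSID. Everything specific in it is an assumption for illustration: the `VOC_NS` namespace, the `TaxonName`, `nameComplete` and `authorship` terms, and the sample LSID are all made up, and the parser only handles the flat "typed node with literal properties" subset of RDF/XML rather than the full serialisation.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical vocabulary namespace, standing in for an LSID vocabulary.
VOC_NS = "http://example.org/voc/TaxonName#"

# Illustrative metadata document; the LSID and the property names are invented.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:tn="{VOC_NS}">
  <tn:TaxonName rdf:about="urn:lsid:example.org:names:12345">
    <tn:nameComplete>Bellis perennis</tn:nameComplete>
    <tn:authorship>L.</tn:authorship>
  </tn:TaxonName>
</rdf:RDF>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into a plain URI string."""
    return tag[1:].replace("}", "")


def triples(rdf_xml):
    """Return (subject, predicate, object) tuples for flat typed nodes only."""
    root = ET.fromstring(rdf_xml)
    out = []
    for node in root:
        subject = node.get(f"{{{RDF_NS}}}about")
        # The element name of a typed node is its rdf:type.
        out.append((subject, RDF_NS + "type", expand(node.tag)))
        for prop in node:
            out.append((subject, expand(prop.tag), (prop.text or "").strip()))
    return out


if __name__ == "__main__":
    for triple in triples(SAMPLE):
        print(triple)
```

Without an agreed vocabulary the predicates in those triples mean nothing to the consumer, which is why the application level classes could not wait for the top down modeling to finish.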
You can still see the pages that render the existing OWL files using XSLT on the TDWG site.
Meanwhile work has continued on the DarwinCore exchange standard that was never formally ratified. This now includes an RDF rendition and follows the Dublin Core model as closely as possible.
This is not in the TDWG ontology namespace but could be moved there or aliased from there.
PESI funds me to work on standards implementation for that project and as part of that work I will update the TDWG ontology.
All files will be moved to being OWL DL and editable using Protege 4. I now believe Protege is stable enough and usable enough to do this.
We will take a structured approach to ontology modeling where we have ‘Bricks’ and ‘Mortar’ ontologies.
We will need to change the way the ontologies are visualized online as it currently relies on a specific XML serialization of the OWL. We may use the OWL Doc plugin for Protege - I’d be grateful for feedback on this.
We will need to change some namespaces but changes are likely to be minor.
This new organisation is still up for discussion.
There are two distinct types of ontology files.
Bricks are totally self-contained and do not import or reference other ontology elements.
Mortars are used to join existing Bricks into more complex ontologies.
The aim is to maximize re-use of Bricks by making sure they entail very little.
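The Bricks and Mortar split can be sketched as a small import graph. This is only an illustration of the rule described above, not the actual TDWG file layout: the `OntologyFile` class and the file names (`TaxonName`, `Specimen`, `Person`, `Collection`, `Taxonomy`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyFile:
    """Hypothetical stand-in for one OWL file and the files it imports."""
    name: str
    imports: tuple = ()

    @property
    def kind(self):
        # A Brick imports nothing; a file that imports others acts as a Mortar.
        return "mortar" if self.imports else "brick"


def bricks_joined_by(mortar_name, files):
    """Return the sorted names of every Brick reachable from the given file."""
    by_name = {f.name: f for f in files}
    seen, stack, bricks = set(), [mortar_name], set()
    while stack:
        current = by_name[stack.pop()]  # KeyError if an import is unknown
        if current.name in seen:
            continue
        seen.add(current.name)
        if current.kind == "brick":
            bricks.add(current.name)
        stack.extend(current.imports)
    return sorted(bricks)


# Illustrative layout: three Bricks and two Mortars that join them.
FILES = [
    OntologyFile("TaxonName"),
    OntologyFile("Specimen"),
    OntologyFile("Person"),
    OntologyFile("Collection", imports=("Specimen", "Person")),
    OntologyFile("Taxonomy", imports=("TaxonName", "Collection")),
]

if __name__ == "__main__":
    print(bricks_joined_by("Taxonomy", FILES))
```

Because a Brick has no imports, anyone can pick up `Person` on its own without also committing to whatever `Collection` or `Taxonomy` entail.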
PESI is the project that is allowing me to re-engage with the ontology efforts.
PESI’s aim is to provide an infrastructure for the taxonomy of European organisms.
From a semantic web point of view this could be viewed as a species ontology for Europe. We should certainly be able to expose data in a form that can be consumed on the semantic web.
This will involve producing some useful ontologies of related data such as geographic regions and occurrence statuses such as “winter migrant”.
PESI is a real project with real deliverables and we need to be careful that we deliver data in a way that can be consumed by real users. Although this may be in line with semantic web technologies it may also involve compromises. There is an element of research in this.
Read more about PESI here: http://www.eu-nomen.eu/pesi/
Because of this TDWG have been through a process of selecting a GUID technology and have chosen LSIDs.
There are issues with this. Personally I believe they are redundant because of the need to proxy them for use in any client software and especially in semantically aware applications.
The approach TDWG have taken is not unusual. The publishing industry have taken a similar approach with DOI and they have similar issues.
A new GBIF task group is being established to look into the problems and try and come up with a solution - even if this involves a centralized authority.
In my view any outcome must support the Linked Data paradigm.
Personally, HTTP URIs are my preferred standard.
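The proxying issue can be shown in a few lines. An LSID has the form `urn:lsid:authority:namespace:object` with an optional `:revision`, and ordinary HTTP clients cannot dereference it directly, so it ends up wrapped in an HTTP URI anyway. In this sketch the proxy address `http://lsid.example.org/` and the sample LSID are hypothetical, and appending the LSID to the proxy base is an assumption about how such a proxy would be addressed.

```python
from urllib.parse import quote

# Hypothetical proxy; a real deployment would use its own resolver address.
PROXY_BASE = "http://lsid.example.org/"


def parse_lsid(lsid):
    """Split an LSID into its parts, raising ValueError if it is malformed."""
    parts = lsid.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {lsid!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {lsid!r}")
    authority, namespace, obj = parts[2:5]
    if not (authority and namespace and obj):
        raise ValueError(f"LSID has an empty component: {lsid!r}")
    return {
        "authority": authority,
        "namespace": namespace,
        "object": obj,
        "revision": parts[5] if len(parts) == 6 else None,
    }


def lsid_to_http_uri(lsid, proxy_base=PROXY_BASE):
    """Wrap an LSID in an HTTP URI that a Linked Data client can follow."""
    parse_lsid(lsid)  # validate before building the URI
    return proxy_base + quote(lsid, safe=":")


if __name__ == "__main__":
    print(lsid_to_http_uri("urn:lsid:example.org:names:12345-1"))
```

Once every client is really using the HTTP form, the `urn:lsid:` layer is doing very little work, which is the redundancy I am referring to.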
TDWG is the place to do it if you want semantic definition of useful things in biodiversity informatics.
Previously we have taken complex hybrid approaches that were part of a growing experience. I believe we should have a clearer separation of semantic and other technologies now.
Stuff is going to happen this year and you can help shape it.
We need client applications outside of taxonomy to consume this stuff or there is no point in doing it.
I find myself explaining the same things over and over again in emails and over beer so I have recently decided to capture such thoughts in blog posts.
These are very informal but may be useful background reading on GUIDs.
Thanks for your time and please email me any comments. I will try and answer them during the meeting.