Semantic Integration of Relational Data Sources With Topic Maps



Integrating heterogeneous data sources plays a major role in the development of modern knowledge management systems. Enriching this data with ontologies opens up new possibilities for applying semantic technologies and for combining information from existing information systems. This paper presents the architecture and prototype implementation of a semantic integration layer that provides transparent access to relational data sources through Topic Maps.

Published in: Technology
  1. Semantic Integration of Relational Data Sources with Topic Maps
       Use case providers: Rani Pinchuk, Thomas Neidhart, Bernard Valentin
       The research was done within the SATOPI Project, a co-funded activity with the European Space Agency (ESA Contract N°: 21520/08/I/OL).
  2. The Problem
       • We have more and more data.
       • The data is spread across heterogeneous data sources.
       • The data is not self-explanatory.
       • Copying the data to a better-organized data store is not an option.
  3. The Vision
       • Create a Semantic Integration Layer that provides:
           • semantic annotation of existing data
           • a “live” view of data stored in heterogeneous data sources
           • a merging of multiple data sources in a clean and understandable way
           • a common interface to different types of data
  4. The Use Case: GLOFs in the Himalayas
       • Glacial Lake Outburst Floods (GLOFs) happen when glaciers melt and glacial lakes form next to them.
       • The glacial lakes are usually dammed by a natural dam made partly of ice. These dams are unstable.
       • When the water level rises, the ice in the dam floats, causing the dam to break.
       • The resulting floods can be very intense. For example, in a GLOF event in 1985, the water from Dig Tsho Lake went downhill in a flood that lasted 4-6 hours, with a flow of 1600 to 2350 cubic meters per second.
  5. The Use Case: GLOFs in the Himalayas
       • GLOF events depend on many factors, such as weather, topography, and the characteristics of the glacier, the glacial lake and its dam.
       • To identify precursors of a GLOF event, the researcher has to collect data from different sources:
           • Different remote Earth Observation sensors, each of which might have its own interface
           • Data already collected about the glacial lake and similar glacial lakes
           • Weather data, etc.
       • The process of accessing and collecting the data slows down the research.
  6. Why Topic Maps?
       • Merging Data
       • Reusing Data
       • Processing Data
       • Navigating Data
       • Accessing Data
  7. Topic Maps versus the Semantic Web
       • Topic Maps technology provides the ability to represent knowledge in an informal, natural way: the way we humans grasp knowledge.
       • RDF/OWL is designed to be processed by machines, and is therefore much more formal, more machine-friendly and less human-friendly.
  8. The Architecture (1)
       • The client accesses the data through a Topic Maps interface, using TMQL or TMAPI.
       • The Topic Maps Engine uses the Ontology Definition and the Mapping Definition files to determine where the data should be retrieved from.
       • If the data is located in the external data stores, the Topic Maps Engine retrieves it from there.
  9. The Architecture (2)
       • The Ontology Definition is an ordinary topic map containing well-defined topics and types; it reflects the application logic.
       • For each attached data source, a separate connector is defined (with a mapping definition file).
       • Additionally, an arbitrary number of topic maps can be loaded.
       • Associations can be defined between any combination of attached connectors and pre-loaded topic maps.
       • The final topic map that is visible to the user through the available interfaces is the sum of all attached connectors, the loaded topic maps, and their defined associations.
  10. The Ontology (1)
  11. The Ontology (2)
  12. The Datastore Connectors
        [Diagram: Datastore, Query, TMQL, TMAPI, TopiEngine, TMAPI, Datastore Connector]
        Ontology (XTM):
            <topicMap xmlns="…" version="2.0">
              <itemIdentity href="…" />
              <topic id="glacier">
                <name><value>Glacier</value></name>
                <occurrence>
                  <type>
                    <topicRef href="#local-id" />
                  </type>
                  <resourceData datatype="string" />
                </occurrence>
        Mapping Definition (CTM Template):
            %ctm 1.0
            %prefix database <glaciers.db>
            %prefix tablename <gka>
            topic - "${NAME}";
              isa glacier;
              ^ ${TMID};
              local-id: "${LOCAL_ID}";
              area: "${AREA_KM2}" .
            is-located-in(location: topic, host: "${COUNTRY_TMID}")
  13. The User Interface: Search Pages
  14. The User Interface: Search Result Lists
  15. The User Interface: Description Pages
  16. Conclusion
        • Topic Maps is a useful data integration technology.
        • Modeling application structure and logic using Topic Maps is efficient and straightforward.
            • Easily understood by domain experts.
        • A common interface for heterogeneous data greatly reduced the complexity of, and the time needed for, application development.
            • Focus on application logic rather than technical integration details.
  17. Open Issues
        • Improve existing Topic Maps interfaces (like TMAPI) to better fit specific use cases:
            • Should we have TMAPI with sessions?
  18. Thank you. Questions?
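
The architecture slides (8 and 9) describe a visible topic map that is the sum of the attached connectors, the loaded topic maps, and the associations defined between them, with external data retrieved on demand. The following is a minimal Python sketch of that idea, not the prototype's implementation. The class `IntegrationLayer`, its method names, the `Topic` structure, and all sample identifiers and names are hypothetical; they are not part of TMAPI, TMQL, or the SATOPI code. The sketch also assumes that topics sharing an identifier are merged, and that an association is shown only when both of its players are visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass
class Topic:
    identifier: str
    topic_type: str
    names: Set[str] = field(default_factory=set)


# An association is (type, player_a, player_b); players are topic identifiers.
Association = Tuple[str, str, str]


class IntegrationLayer:
    """Hypothetical stand-in for the Topic Maps engine of slides 8 and 9."""

    def __init__(self) -> None:
        self.connectors: List[Callable[[], Iterable[Topic]]] = []
        self.loaded: List[Topic] = []
        self.associations: List[Association] = []

    def attach_connector(self, fetch: Callable[[], Iterable[Topic]]) -> None:
        # One connector per attached data source; `fetch` reads the source.
        self.connectors.append(fetch)

    def load_topics(self, topics: Iterable[Topic]) -> None:
        # Topics from an ordinary, pre-loaded topic map.
        self.loaded.extend(topics)

    def associate(self, assoc_type: str, a: str, b: str) -> None:
        self.associations.append((assoc_type, a, b))

    def visible_topics(self) -> Dict[str, Topic]:
        # Connectors are queried on every call, so the view stays "live".
        sources = [self.loaded] + [fetch() for fetch in self.connectors]
        merged: Dict[str, Topic] = {}
        for source in sources:
            for t in source:
                target = merged.setdefault(
                    t.identifier, Topic(t.identifier, t.topic_type)
                )
                target.names |= t.names  # merge topics with the same identifier
        return merged

    def visible_associations(self) -> List[Association]:
        topics = self.visible_topics()
        return [a for a in self.associations if a[1] in topics and a[2] in topics]


# Made-up sample data; `rows` stands in for a table in an external data store.
rows = [("glacier-001", "Example Glacier")]
layer = IntegrationLayer()
layer.attach_connector(lambda: [Topic(i, "glacier", {n}) for i, n in rows])
layer.load_topics([Topic("country-001", "country", {"Example Country"})])
layer.associate("is-located-in", "glacier-001", "country-001")
```

Because the connector is called at query time rather than copied at load time, appending a row to `rows` changes the next result of `visible_topics()` without reloading anything, which is the "live view" property named on the Vision slide.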
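
The mapping definition on slide 12 is a CTM template whose `${...}` placeholders are named after columns of the source table. The slides do not say how the connector fills in the template, so the sketch below is only one plausible reading: each table row is substituted into the template to produce one CTM fragment. It uses Python's standard `string.Template`, which happens to share the `${NAME}` placeholder syntax, and an in-memory SQLite table. The table layout, the `render_rows` and `demo` helpers, and every sample value are assumptions made for illustration.

```python
import sqlite3
from string import Template

# Body of the mapping template from slide 12 (the % directives are omitted).
CTM_TEMPLATE = Template(
    'topic - "${NAME}";\n'
    "  isa glacier;\n"
    "  ^ ${TMID};\n"
    '  local-id: "${LOCAL_ID}";\n'
    '  area: "${AREA_KM2}" .\n'
    'is-located-in(location: topic, host: "${COUNTRY_TMID}")\n'
)


def render_rows(conn, table):
    """Yield one CTM fragment per row of `table`.

    `table` must be a trusted, hard-coded name: it is interpolated into SQL.
    """
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    for row in conn.execute(f"SELECT * FROM {table}"):
        yield CTM_TEMPLATE.substitute(dict(row))


def demo():
    # Hypothetical table named after the `tablename` prefix on the slide.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gka (NAME TEXT, TMID TEXT, LOCAL_ID TEXT, "
        "AREA_KM2 REAL, COUNTRY_TMID TEXT)"
    )
    conn.execute(
        "INSERT INTO gka VALUES (?, ?, ?, ?, ?)",
        ("Example Glacier", "glacier-001", "G001", 12.5, "country-001"),
    )
    return list(render_rows(conn, "gka"))


if __name__ == "__main__":
    for fragment in demo():
        print(fragment)
```

`Template.substitute` raises `KeyError` when a placeholder has no matching column, which surfaces a mismatch between the mapping definition and the table schema early instead of emitting a partially filled topic.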