Ontology For Data Integration

Some Thoughts Juan Esteva, Ph. D. . 751 Malena Dr., Ann Arbor, MI 48103 Tel: 734-786-0233 Cell 734-277-4962 Fax 734-821-0235 SkypeDrEsteva juan.esteva@Ajatella.com Ontology Data Integration For Competitive Decision Making

Not Just The Facts 3/4/2010 Juan Esteva, Ph. D. 2 “Good decisions are based on information that is analyzed and transformed into usable knowledge” Eileen Feretic

Information at the point of impact 3/4/2010 Juan Esteva, Ph. D. 3 “Information needs to be at the point of impact—at the front lines where people are making decisions. The right analysis needs to be done at the right place. It’s important for organizations to treat information as a strategic asset in order to optimize every decision, every process, everything they do.” AmbujGoyal,

Data in Silos 3/4/2010 Juan Esteva, Ph. D. 4 “One of the biggest challenges organizations face is the amount of data sitting in silos, too often, valuable data simply isn’t accessible or available.” Boris Evelson

Business Decisions for Competitive Advantage 3/4/2010 Juan Esteva, Ph. D. 5 “In today’s troubled economy and competitive business environment, making good decisions is a matter of survival. But good decisions aren’t based on gut feeling alone. They should be based on information gathered from multiple sources, which is then synthesized and analyzed to generate a road map of options and possible outcomes that transform data into usable knowledge” Eileen Feretic

Business Intelligence 3/4/2010 Juan Esteva, Ph. D. 6 Business Intelligence and now Business Analytics systems come into play [However,] it is hard to assemble [heterogeneous data and] disparate pieces of information in a way that provides the intelligence and insight needed to make good business decisions. Eileen Feretic Alas enter Ontology Data Integration.

Data Integration 3/4/2010 Juan Esteva, Ph. D. 7 Data integration provides the ability to manipulate data transparently across multiple data sources. Based on the architecture there are 2 systems: Central Data Integration A central data integration system usually has a global schema, which provides the user with a uniform interface to access information stored in the data sources Peer-2-peer In contrast, in a peer-to-peer data integration system, there are no global points of control on the data sources (or peers). Instead, any peer can accept user queries for the information distributed in the whole system.

Common Approaches for Data Integration 3/4/2010 Juan Esteva, Ph. D. 8 Global-as-View In the GaV approach, every entity in the global schema is associated with a view over the source local schema. Therefore querying strategies are simple, but the evolution of the local source schemas is not easily supported. Local-as-View On the contrary, the LaV approach permits changes to source schemas without affecting the global schema, since the local schemas are defined as views over the global schema, but query processing can be complex.

Data Heterogeneity 3/4/2010 Juan Esteva, Ph. D. 9 Data sources can be heterogeneous in: Syntax Syntactic heterogeneity is caused by the use of different models or languages. Schema Schematic heterogeneity results from structural differences. Semantics Semantic heterogeneity is caused by different meanings or interpretations of data in various contexts To achieve data interoperability, the issues posed by data heterogeneity need to be eliminated

Possible Solutions 3/4/2010 Juan Esteva, Ph. D. 10 The advent of XML has created a syntactic platform for Web data standardization and exchange. However, schematic data heterogeneity may persist, depending on the XML schemas used (e.g., nesting hierarchies). Likewise, semantic heterogeneity may persist even if both syntactic and schematic heterogeneities do not occur (e.g., naming concepts differently). We should be concerned with solving all three kinds of heterogeneities by bridging syntactic, schematic, and semantic heterogeneities across different sources.

Semantic Data Integration Using Ontologies 3/4/2010 Juan Esteva, Ph. D. 11 We call semantic data integration the process of using a conceptual representation of the data and of their relationships to eliminate possible heterogeneities. At the heart of semantic data integration is the concept of ontology, which is an explicit specification of a shared conceptualization

Ontology & Data Integration 3/4/2010 Juan Esteva, Ph. D. 12 Metadata Representation. Metadata (i.e., source schemas) in each data source can be explicitly represented by a local ontology, using a single language. Global Conceptualization. The global ontology provides a conceptual view over the schematically-heterogeneous source schemas. Support for High-level Queries. Given a high-level view of the sources, as provided by a global ontology, the user can formulate a query without specific knowledge of the different data sources. The query is then rewritten into queries over the sources, based on the semantic mappings between the global and local ontologies. Declarative Mediation. Query processing in a hybrid peer-to-peer system uses the global ontology as a declarative mediator for query rewriting between peers. Mapping Support. A thesaurus, formalized in terms of an ontology, can be used for the mapping process to facilitate its automation.

What do we need? 3/4/2010 Juan Esteva, Ph. D. 13 Increase search capabilities From discovery to reasoning Increasing metadata as to provide strong semantics From glossaries to ontologies Consequently, moving from syntactic interoperability to structural interoperability and finally to semantic interoperability

Graphically the model progression will be [2] 3/4/2010 Juan Esteva, Ph. D. 14 The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).

Juan Esteva, Ph. D. 3/4/2010 15 References

References 3/4/2010 Juan Esteva, Ph. D. 16 Applying 4D ontologies to Enterprise Architecture, Matthew West, Shell Corp. FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot, 2005 The Role of Ontologies in Data Integration, Isabel F. Cruz Huiyong Xiao

Topic Maps 3/4/2010 Juan Esteva, Ph. D. 17 Topic Maps is a standard for the representation and interchange of knowledge, with an emphasis on the findability of information. The ISO standard is formally known as ISO/IEC 13250:2003. A topic map represents information using topics (representing any concept, from people, countries, and organizations to software modules, individual files, and events), associations (representing the relationships between topics), and occurrences (representing information resources relevant to a particular topic).

SKOS 3/4/2010 Juan Esteva, Ph. D. 18 Simple Knowledge Organization System (SKOS) SKOS is a common data model for sharing and linking knowledge organization systems via the Web.

RDF 3/4/2010 Juan Esteva, Ph. D. 19 Resource Description Language RDF RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed.

OWL 3/4/2010 Juan Esteva, Ph. D. 20 Web Ontology Language OWL is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language such that knowledge expressed in OWL can be reasoned with by computer programs either to verify the consistency of that knowledge or to make implicit knowledge explicit. OWL documents, known as ontologies, can be published in the World Wide Web and may refer to or be referred from other OWL ontologies. OWL is part of the W3C’s Semantic Web technology stack, which includes RDF, RDFS, SPARQL, etc.

Ontology For Data Integration

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (6)

Similar to Ontology For Data Integration

Similar to Ontology For Data Integration (20)

Recently uploaded

Recently uploaded (20)

Ontology For Data Integration