Knowledge Engineering Course (SCOM7348)
                                                                         University of Birzeit, Palestine
                                                                                       December, 2012




                                   Synthesis Paper Talk


                            Ontology-Based
                            Data Integration
                                 Ali Ahmad Al Jadaa
                              Master of Computing, Birzeit University
                                         Ali.ps@live.com



This is a student talk for the Knowledge Engineering course, in which each student is
                    asked to present his or her synthesis paper.
      Course Page: http://jarrar-courses.blogspot.com/2011/09/knowledgeengineering-fall2011.html


                                                                                                        1
Abstract

• We need to obtain information from
  several local or external sources. Each
  source may be built in a different way, so we
  face conflicts in meaning, in structure,
  and in other aspects.
  We will also see examples that show why
  data integration is needed.
• Ontologies provide effective solutions to data
  integration in several ways, which this talk
  highlights.
                                              2
What is Data

• Data is a collection of facts, such as values
  or measurements. It can be numbers,
  words, measurements, observations or
  even just descriptions of things.




                                              3
What is data integration

• Combining data that come from different
  sources and providing users with a unified
  view of these data.




                                            4
What is Semantics

• Understanding meaning.
• The relation between signifiers (words, signs, symbols) and what they stand for.
• Difference with other

                         ≠

                                  5
What is Ontology

• A set of concepts within a domain and the
  relationships between pairs of concepts. It
  can be used to model a domain and support
  reasoning about entities.
• A dictionary that computers can understand.
• A list of things that exist.
• A description of the kinds of entities there are
  and how they are related.
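
The idea of "concepts, relationships, and reasoning" can be sketched in a few lines. This is only an illustration: the concepts (Person, Student, MasterStudent), the relation names (is_a, has) and both helper functions are hypothetical and do not come from the talk or from any ontology language.

```python
# Hypothetical mini-ontology stored as (subject, relation, object) triples.
TRIPLES = {
    ("Student", "is_a", "Person"),
    ("MasterStudent", "is_a", "Student"),
    ("Person", "has", "Name"),
}

def ancestors(concept, triples=TRIPLES):
    """Return every concept reachable from `concept` by following is_a links."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in triples:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found

def properties(concept, triples=TRIPLES):
    """Properties stated for the concept itself or inherited from its ancestors."""
    scope = {concept} | ancestors(concept, triples)
    return {obj for subject, relation, obj in triples
            if relation == "has" and subject in scope}
```

The reasoning step here is the inheritance: nothing states directly that a MasterStudent has a Name, but it follows from the is_a chain.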


                                              6
Why we need data integration

• The world has become a small village, and
  the speed of completing transactions has
  become one of the most important features
  of any state.




                                            7
Example

• If a student wants to travel to complete his studies, he
  must visit many ministries to prepare:

•   1 - General Certificate of Secondary Education from the school.
•   2 - Ratification certificate from the Ministry of Education.
•   3 - Identification from the Ministry of the Interior.
•   4 - Valid passport from the Ministry of the Interior.
•   5 - Disease-free certificate from the Ministry of Health.
•   6 - Guarantee from a bank.




                                                               8
The problem of data integration

• Since the data come from different sources,
  and each data source is built in a different way
  and with different meanings, we face many
  problems.




                                            9
Name Heterogeneities

• Different names are used for the same concept
  (synonyms).
• For example, one schema uses (Code),

• and another one uses (Number) or (No.).
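
One common way to handle synonyms is a lookup table that renames every source column to a single shared name. The sketch below assumes that Code is chosen as the shared name; the table, the function name and the sample rows are illustrative only.

```python
# Assumed synonym table: each source column name maps to one shared name.
SYNONYMS = {"Code": "Code", "Number": "Code", "No.": "Code"}

def to_shared_names(record, synonyms=SYNONYMS):
    """Rename the keys of one record to the shared vocabulary.

    Keys that are not in the synonym table are kept unchanged.
    """
    return {synonyms.get(key, key): value for key, value in record.items()}

# Hypothetical rows from two schemas that name the same attribute differently.
row_a = {"Code": 17, "Name": "Ali"}
row_b = {"No.": 17, "Name": "Ali"}
```

After renaming, the two rows become identical, so they can be compared or merged directly.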




                                                      10
Meaning Heterogeneities

• The same name is used for different
  concepts (homonyms). For example, one schema
  uses City to mean the birth city, but another
  schema uses it to mean the work city.
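
Homonyms cannot be resolved from the column name alone; the source schema has to be part of the lookup. A minimal sketch, assuming two hypothetical schema identifiers (schema_a, schema_b) and two shared concept names (BirthCity, WorkCity) that are not in the original slides:

```python
# Assumed mapping: (schema, column) -> unambiguous shared concept.
CONCEPT_OF = {
    ("schema_a", "City"): "BirthCity",
    ("schema_b", "City"): "WorkCity",
}

def annotate(schema, record, concept_of=CONCEPT_OF):
    """Replace ambiguous column names with the concept they mean in `schema`."""
    return {concept_of.get((schema, key), key): value
            for key, value in record.items()}
```

The same input column, City, ends up under two different shared names depending on which schema it came from.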




                                       11
Structure Heterogeneities

• Different information systems store
  their data in different
  structures.
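
The slide gives no concrete case, so the following is an assumed one: one system stores an address as a nested record, another stores the same facts as flat columns. The field names and the underscore naming rule are hypothetical.

```python
def flatten(record, prefix=""):
    """Turn a nested record into a flat one, joining nested keys with '_'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into the nested structure, extending the key prefix.
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat

# Hypothetical records holding the same facts in two structures.
nested = {"Code": 17, "Address": {"City": "Ramallah", "Street": "Main"}}
flat = {"Code": 17, "Address_City": "Ramallah", "Address_Street": "Main"}
```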




                                           12
Type Heterogeneities

• The same attribute is stored with different data types.
  For example, the attribute "Gender" uses a String data
  type ("Male", "Female") in one schema, but a
  Boolean data type (0, 1) in another.
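
A type conflict like this is usually resolved by converting both encodings to one target type. The sketch below picks the string form as the target. Note the assumption: the slide does not say which Boolean value stands for which gender, so 1 = "Male" and 0 = "Female" is chosen here purely for illustration.

```python
# Assumption: 1 encodes "Male" and 0 encodes "Female" (not stated in the talk).
BOOLEAN_TO_GENDER = {1: "Male", 0: "Female"}

def normalize_gender(value):
    """Convert either encoding to the string form "Male" / "Female"."""
    if isinstance(value, str):
        # Tolerate stray spaces and different capitalization in string sources.
        text = value.strip().capitalize()
        if text in ("Male", "Female"):
            return text
        raise ValueError(f"unknown gender string: {value!r}")
    if value in BOOLEAN_TO_GENDER:
        # True/False compare equal to 1/0, so real booleans are accepted too.
        return BOOLEAN_TO_GENDER[value]
    raise ValueError(f"unknown gender code: {value!r}")
```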




                                                          13
Rules and Constraints Heterogeneities

• Different rules or cardinalities apply to the same attributes
  or relationships. For example, one schema restricts the age of
  a student to 18-25 years, but another one allows
  18-30.
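
With the two age ranges from the example, it is easy to check which values the schemas disagree about. The constant and function names below are illustrative, not part of the talk.

```python
# Age constraints taken from the two schemas in the example (inclusive bounds).
RANGE_A = (18, 25)
RANGE_B = (18, 30)

def in_range(age, bounds):
    """True when `age` lies within the inclusive bounds."""
    low, high = bounds
    return low <= age <= high

def conflicts(age):
    """True when the two schemas disagree on whether the age is valid."""
    return in_range(age, RANGE_A) != in_range(age, RANGE_B)
```

Only ages 26 to 30 are in conflict: they are valid in the second schema but rejected by the first, so an integrated system has to decide which rule wins.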




                                                        14
Model Heterogeneities

• These occur when different databases adhere to
  different data models (for example, relational versus object-oriented).




                                             15
Service Oriented Architecture

• A set of principles and methodologies for
  designing and developing software in the
  form of interoperable services [1].




                                             16
Publish-Subscribe Architecture

• Networking technologies and products
  enable a high degree of connectivity across
  a large number of computers, applications,
  and users [5].




                                            17
Consolidation

• Involves capturing data from multiple
  source systems and integrating them into a single
  persistent data store. The latency of the
  information in the consolidated data store
  depends on whether batch or real-time
  data consolidation is used and on how
  often updates are applied to the
  data store [2].
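
A batch consolidation step can be sketched with Python's built-in sqlite3 module. Everything specific here is an assumption made for illustration: the table layout, the source names (ministry_a, ministry_b) and the sample rows. An in-memory database stands in for the persistent store.

```python
import sqlite3

def consolidate(store, sources):
    """Batch step: copy every row from each source into one shared table.

    `sources` maps a source name to a list of (code, name) rows.
    Re-running the batch refreshes existing rows instead of duplicating them.
    """
    store.execute(
        "CREATE TABLE IF NOT EXISTS Person ("
        "Source TEXT, Code INTEGER, Name TEXT, PRIMARY KEY (Source, Code))"
    )
    for source_name, rows in sources.items():
        store.executemany(
            "INSERT OR REPLACE INTO Person (Source, Code, Name) VALUES (?, ?, ?)",
            [(source_name, code, name) for code, name in rows],
        )
    store.commit()

def row_count(store):
    """Number of rows currently held in the consolidated table."""
    return store.execute("SELECT COUNT(*) FROM Person").fetchone()[0]

# Hypothetical sources; a file path instead of ":memory:" would make the store persistent.
store = sqlite3.connect(":memory:")
sources = {"ministry_a": [(1, "Ali")], "ministry_b": [(1, "Sara"), (2, "Omar")]}
consolidate(store, sources)
```

The latency mentioned above corresponds to how often consolidate is run: data in the store is only as fresh as the last batch.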



                                            18
Multibase system

• A multibase (multiple database) system
  allows users to view the databases
  through a single global schema, giving
  users the impression that a federated database
  exists [3].




                                           19
Data Warehouse

• A database used for reporting and data
  analysis [4].




                                              20
Federated systems (Virtual Data Integration)

• Characterized by the existence of a
  federated schema, which establishes the
  interface to the integrated system [4].




                                              21
Example




          22
Solution

• Can be solved with relational views (A to B):
CREATE VIEW Men AS
SELECT Code, Name FROM Person
WHERE Gender = 'Male';
CREATE VIEW Women AS
SELECT Code, Name FROM Person
WHERE Gender = 'Female';
• Or can be solved with a relational view (B to A):
CREATE VIEW Person (Code, Name, Gender) AS
SELECT Code, Name, 'Male'
 FROM Men
UNION
SELECT Code, Name, 'Female'
 FROM Women;
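
Both directions can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column types and the sample rows are invented for the demonstration, and the B-to-A view names its columns with AS aliases rather than a column list so that it also runs on older SQLite versions.

```python
import sqlite3

# Direction A to B: schema A has one Person table; Men and Women are views over it.
a = sqlite3.connect(":memory:")
a.executescript("""
    CREATE TABLE Person (Code INTEGER, Name TEXT, Gender TEXT);
    INSERT INTO Person VALUES (1, 'Ali', 'Male');
    INSERT INTO Person VALUES (2, 'Sara', 'Female');
    CREATE VIEW Men AS SELECT Code, Name FROM Person WHERE Gender = 'Male';
    CREATE VIEW Women AS SELECT Code, Name FROM Person WHERE Gender = 'Female';
""")
men = a.execute("SELECT Code, Name FROM Men").fetchall()
women = a.execute("SELECT Code, Name FROM Women").fetchall()

# Direction B to A: schema B has Men and Women tables; Person is a view over both.
b = sqlite3.connect(":memory:")
b.executescript("""
    CREATE TABLE Men (Code INTEGER, Name TEXT);
    CREATE TABLE Women (Code INTEGER, Name TEXT);
    INSERT INTO Men VALUES (1, 'Ali');
    INSERT INTO Women VALUES (2, 'Sara');
    CREATE VIEW Person AS
        SELECT Code, Name, 'Male' AS Gender FROM Men
        UNION
        SELECT Code, Name, 'Female' AS Gender FROM Women;
""")
people = b.execute(
    "SELECT Code, Name, Gender FROM Person ORDER BY Code"
).fetchall()
```

In both cases no data is copied: each view is computed from the underlying tables at query time, which is what makes this a virtual form of integration.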
                                                   23
References

1.   Bell, Michael (2010). SOA Modeling Patterns for Service-Oriented Discovery and
     Analysis. Wiley & Sons. p. 390. ISBN 978-0-470-48197-4.
2.   Amit P. Sheth and J.A. Larson, Federated Database Systems for Managing
     Distributed, Heterogeneous, and Autonomous Databases, ACM Computing
     Surveys, Vol. 22, No. 3, pp. 183-236, September 1990.
3.   Manuel Garcia-Solaco, Felix Saltor, and Malu Castellanos, Semantic Heterogeneity in
     Multidatabase Systems, in Object-Oriented Multidatabase Systems: A Solution for
     Advanced Applications, Omran A. Bukhres and Ahmed K. Elmagarmid, editors,
     Prentice-Hall, 1996, Chapter 5, pp. 129-202.
4.   Oracle® Database Application Developer's Guide – Fundamentals, 10g Release 1
     (10.1), Part Number B10795-01.




                                                                                      24
Thank You




            25
