PalGov Tutorial 2, Session 12.2: Architectural Solutions for the Integration Issues


Published in: Education

  1. The Palestinian eGovernment Academy (أكاديمية الحكومة الإلكترونية الفلسطينية), www.egovacademy.ps
     Tutorial II: Data Integration and Open Information Systems
     Session 12.2: Architectural Solutions for the Integration Issues
     Dr. Mustafa Jarrar, University of Birzeit
     PalGov © 2011
  2. About
     This tutorial is part of the PalGov project, funded by the TEMPUS IV program of the Commission of the European Communities, grant agreement 511159-TEMPUS-1-2010-1-PS-TEMPUS-JPHES. The project website: www.egovacademy.ps
     Project Consortium:
     • Birzeit University, Palestine (Coordinator)
     • University of Trento, Italy
     • Palestine Polytechnic University, Palestine
     • Vrije Universiteit Brussel, Belgium
     • Palestine Technical University, Palestine
     • Université de Savoie, France
     • Ministry of Telecom and IT, Palestine
     • University of Namur, Belgium
     • Ministry of Interior, Palestine
     • TrueTrust, UK
     • Ministry of Local Government, Palestine
     Coordinator: Dr. Mustafa Jarrar, Birzeit University, P.O. Box 14, Birzeit, Palestine. Telfax: +972 2 2982935, mjarrar@birzeit.edu
  3. © Copyright Notes
     Everyone is encouraged to use this material, or part of it, but should properly cite the project (logo and website), and the author of that part. No part of this tutorial may be reproduced or modified in any form or by any means, without prior written permission from the project, who have the full copyrights on the material.
     Attribution-NonCommercial-ShareAlike (CC-BY-NC-SA): This license lets others remix, tweak, and build upon your work non-commercially, as long as they credit you and license their new creations under the identical terms.
  4. Tutorial Map
     Topic (hours):
     • Session 1: XML Basics and Namespaces (3)
     • Session 2: XML DTDs (3)
     • Session 3: XML Schemas (3)
     • Session 4: Lab-XML Schemas (3)
     • Session 5: RDF and RDFS (3)
     • Session 6: Lab-RDF and RDFS (3)
     • Session 7: OWL (Web Ontology Language) (3)
     • Session 8: Lab-OWL (3)
     • Session 9: Lab-RDF Stores: Challenges and Solutions (3)
     • Session 10: Lab-SPARQL (3)
     • Session 11: Lab-Oracle Semantic Technology (3)
     • Session 12_1: The Problem of Data Integration (1.5)
     • Session 12_2: Architectural Solutions for the Integration Issues (1.5)
     • Session 13_1: Data Schema Integration (1)
     • Session 13_2: GAV and LAV Integration (1)
     • Session 13_3: Data Integration and Fusion using RDF (1)
     • Session 14: Lab-Data Integration and Fusion using RDF (3)
     • Session 15_1: Data Web and Linked Data (1.5)
     • Session 15_2: RDFa (1.5)
     • Session 16: Lab-RDFa (3)
     Intended Learning Objectives:
     • A: Knowledge and Understanding
       – 2a1: Describe tree and graph data models.
       – 2a2: Understand the notation of XML, RDF, RDFS, and OWL.
       – 2a3: Demonstrate knowledge about querying techniques for data models, such as SPARQL and XPath.
       – 2a4: Explain the concepts of identity management and Linked Data.
       – 2a5: Demonstrate knowledge about integration and fusion of heterogeneous data.
     • B: Intellectual Skills
       – 2b1: Represent data using tree and graph data models (XML & RDF).
       – 2b2: Describe data semantics using RDFS and OWL.
       – 2b3: Manage and query data represented in RDF, XML, OWL.
       – 2b4: Integrate and fuse heterogeneous data.
     • C: Professional and Practical Skills
       – 2c1: Use Oracle Semantic Technology and/or Virtuoso to store and query RDF stores.
     • D: General and Transferable Skills
       – 2d1: Working with a team.
       – 2d2: Presenting and defending ideas.
       – 2d3: Use of creativity and innovation in problem solving.
       – 2d4: Develop communication skills and logical reasoning abilities.
  5. Module ILOs
     After completing this module students will be able to:
     – Explain different architectural solutions to the problem of data integration.
  6. Architectural Solutions for the Integration Issues
     • Two families of solutions for the integration issue:
       – Application-driven Integration
         • Various types of middleware (e.g. Web Services, Remote Procedure Call (RPC), Publish & Subscribe) that achieve reconciliation through application-to-middleware communication
       – Data-driven Integration
         • Various types of data reconciliation and integration
           – Consolidation
           – Data Warehouse
           – Data Integration
  7. Architectures of application-driven Integration
     e.g., Service Oriented Architecture
     [Figure: Adapter Servers (AS) and Security Servers (SS) exchanging data messages MSG-1 ... MSG-N]
     Legend: SS = Security Server, AS = Adapter Server, MSG = Data Message
  8. Architectures of application-driven Integration (Source: Carlo Batini)
     e.g., Publish-Subscribe Architecture
     Typical application-driven integration architecture for integration of updates.
     [Figure: an update of an object O passes in numbered steps 1 to 7 between Application 1 ... Application n, their sources Source 1 ... Source n, and the middleware; arrows are labelled "Publishes" and "Subscribes"]
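
The publish-subscribe idea on this slide can be sketched in a few lines of Python. This is a minimal in-process illustration, not the architecture from the slide: the class names `Middleware` and `Application`, the topic string, and the record fields are all hypothetical, and a real deployment would use a message broker rather than direct function calls.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, dict], None]


class Middleware:
    """Toy broker (hypothetical): forwards each published update to all subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, object_type: str, handler: Handler) -> None:
        self._subscribers[object_type].append(handler)

    def publish(self, object_type: str, update: dict) -> int:
        # Deliver the update to every handler registered for this object type.
        handlers = self._subscribers.get(object_type, [])
        for handler in handlers:
            handler(object_type, update)
        return len(handlers)


class Application:
    """An application with its own source, modelled here as a plain dict."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.source: Dict[Tuple[str, int], dict] = {}

    def on_update(self, object_type: str, update: dict) -> None:
        # Apply the received update to the local source.
        self.source[(object_type, update["id"])] = dict(update)


if __name__ == "__main__":
    middleware = Middleware()
    app2, app_n = Application("Application 2"), Application("Application n")
    middleware.subscribe("citizen", app2.on_update)
    middleware.subscribe("citizen", app_n.on_update)
    # Application 1 publishes an update of object O; both subscribers receive it.
    delivered = middleware.publish("citizen", {"id": 1, "city": "Ramallah"})
    print(delivered, app2.source, app_n.source)
```

The publisher never addresses the other applications directly, which is the point of the pattern: adding a new subscriber requires no change to the application that publishes the update.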
  9. Information Integration Architectures (Source: Carlo Batini)
     Consolidation
     [Figure: Source 1, Source 2, ..., Source n merged into a unique DB]
     New architecture, built once and for all
  10. Information Integration Architectures (Source: Carlo Batini)
     Data Warehouse
     [Figure: Source 1, Source 2, ..., Source n feed a unique DB through data warehouse middleware]
     New architecture: a new database that is periodically updated
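
As a rough illustration of "a new database periodically updated", the sketch below loads snapshots from two sources into one warehouse table using Python's built-in `sqlite3` module. The table layout, the function names `create_warehouse`, `refresh`, and `total_by_item`, and the sample sales data are assumptions made for the example; they do not come from the slides.

```python
import sqlite3
from typing import Callable, Dict, Iterable, List, Tuple

Extract = Callable[[], Iterable[Tuple[str, float]]]


def create_warehouse() -> sqlite3.Connection:
    """Create the separate warehouse database (in memory for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (source TEXT, item TEXT, amount REAL, loaded_on TEXT)"
    )
    return conn


def refresh(conn: sqlite3.Connection, sources: Dict[str, Extract], loaded_on: str) -> int:
    """One periodic load: append a dated snapshot from every source.

    Old rows are kept, so historical data stays available.
    """
    rows = [
        (name, item, amount, loaded_on)
        for name, extract in sources.items()
        for item, amount in extract()
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def total_by_item(conn: sqlite3.Connection, loaded_on: str) -> List[Tuple[str, float]]:
    """Query the warehouse only; the sources are not contacted."""
    cursor = conn.execute(
        "SELECT item, SUM(amount) FROM sales WHERE loaded_on = ? "
        "GROUP BY item ORDER BY item",
        (loaded_on,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    sources = {
        "retail": lambda: [("pen", 2.0)],
        "online": lambda: [("pen", 3.0), ("ink", 5.0)],
    }
    warehouse = create_warehouse()
    refresh(warehouse, sources, "2011-01-01")
    print(total_by_item(warehouse, "2011-01-01"))
```

Queries run against the warehouse copy, so they are only as current as the last call to `refresh`; that trade-off between history and currency is what distinguishes this architecture from virtual integration.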
  11. Information Integration Architectures (Source: Carlo Batini)
     Virtual Data Integration
     [Figure: each of Source 1, Source 2, ..., Source n has a local schema; a mediator relates the local schemas to a global schema]
     New architecture, but no new database!
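
The mediator approach can be sketched as follows. This is a simplified illustration under stated assumptions: each source is a Python list of dicts with its own attribute names, the `Wrapper` and `Mediator` classes and the attribute mappings are hypothetical, and the mapping is a plain renaming rather than a full GAV or LAV view definition.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


class Wrapper:
    """Presents one source under the global schema by renaming local attributes."""

    def __init__(self, fetch: Callable[[], Iterable[Row]], mapping: Dict[str, str]) -> None:
        self._fetch = fetch
        self._mapping = mapping  # global attribute -> local attribute

    def rows(self) -> Iterator[Row]:
        # Read the source at query time; nothing is copied or stored.
        for local_row in self._fetch():
            yield {g: local_row[l] for g, l in self._mapping.items()}


class Mediator:
    """Answers queries over the global schema by asking every wrapped source."""

    def __init__(self, wrappers: Iterable[Wrapper]) -> None:
        self._wrappers = list(wrappers)

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        return [
            row
            for wrapper in self._wrappers
            for row in wrapper.rows()
            if predicate(row)
        ]


if __name__ == "__main__":
    source_1 = [{"nm": "Ali", "town": "Birzeit"}]
    source_2 = [{"full_name": "Mona", "city": "Nablus"}]
    mediator = Mediator([
        Wrapper(lambda: source_1, {"name": "nm", "city": "town"}),
        Wrapper(lambda: source_2, {"name": "full_name", "city": "city"}),
    ])
    print(mediator.query(lambda row: row["city"] == "Birzeit"))
```

Because the sources are read on every call to `query`, answers always reflect current data, at the cost of contacting every source for each query.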
  12. Additional Reading: The integration problem… (Source: Carlo Batini)
     [Figure: Source 1 ... Source n, including Registry of clients 1, Registry of clients 2, Retail sales, On line sales, and Other, feeding a new architecture]
     Which kind of integration? How to decide?
  13. Additional Reading: Criteria to be adopted (Source: Carlo Batini)
     • autonomy, the degree of independence between the different database administrators in their design choices;
     • relevance of historical data, and consequent need to periodically store new data without deleting the old ones;
     • query complexity, in terms of amount of data and tables visited and number of operators on them, and consequent time complexity in query execution;
     • relevance of currency in queries, the need for queries to extract current data;
     • economic value of integration, the relevance of having integrated information in input for business operational and decisional processes in order to produce effective outputs;
  14. Additional Reading: Criteria to be adopted (Source: Carlo Batini)
     • volatility of sources, frequency of adding or deleting sources, and frequency of change of source schemas;
     • relevance of queries w.r.t. transactions, relative importance and frequency of queries with respect to changes in data;
     • management complexity, the effort to be spent in management activities related to databases and hardware-software infrastructures, due to the corresponding complexity of the organizations using the databases;
     • costs of heterogeneity, hidden and explicit costs related to business processes that are due to making use of heterogeneous data.
  15. References
     • Carlo Batini: Course on Data Integration. BZU IT Summer School 2011.
     • Stefano Spaccapietra: Information Integration. Presentation at the IFIP Academy. Porto Alegre. 2005.
     • Chris Bizer: The Emerging Web of Linked Data. Presentation at SRI International, Artificial Intelligence Center. Menlo Park, USA. 2009.