Your SlideShare is downloading. ×
Jarrar: Architectural Solutions in Data Integration
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Jarrar: Architectural Solutions in Data Integration

179
views

Published on

There ara two families of solutions for the integration issue: …

There ara two families of solutions for the integration issue:
- Application-driven Integration
- Data-driven Integration

Published in: Education, Technology

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
179
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
0
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Mustafa Jarrar Lecture Notes, Web Data Management (MCOM7348) University of Birzeit, Palestine 1st Semester, 2013 Architectural Solutions in Data Integration Dr. Mustafa Jarrar University of Birzeit mjarrar@birzeit.edu www.jarrar.info Jarrar © 2013 1
  • 2. Watch this lecture and download the slides from http://jarrar-courses.blogspot.com/2013/11/web-data-management.html Jarrar © 2013 2
  • 3. Different Solutions Two families of solutions for the integration issue: –  Application-driven Integration •  Various types of middleware (e.g. Web Services, Remote Procedure Call (RPC), Publish & Subscribe) that achieve reconciliation through application to middleware communication –  Data-driven Integration •  Various types of data reconciliation and integration –  Consolidation –  Data Warehouse –  Data Integration Jarrar © 2013 3
  • 4. Architectures of application-driven Integration Service Oriented Architecture AS SS .  .  . MSG-­‐1 .  .  . enterprise service bus SS AS SS AS Jarrar © 2013 AS SS MSG-­‐N SS AS .  .  . SS AS Legend SS  =  Security  Server AS  =  Adapter  Server MSG  = Data  Message 4
  • 5. Architectures of application-driven Integration Source: Carlo Batini Publish-Subscribe Architecture Update of an object O 1 2 Middleware 5 7 Application 1 6 Source 1 Application 2 4 Source 2 Subscribes 3 Application n Source n Publishes Typical application-driven integration architecture for integration of updates. Jarrar © 2013 5
  • 6. Information Integration Architectures Source: Carlo Batini Consolidation Source 1 Source 1 Source 2 ….. Source 2 Unique DB Source n New architecture once for all Source n Jarrar © 2013 6
  • 7. Information Integration Architectures Source: Carlo Batini Data Warehouse Source 1 Source 2 Data Warehouse middleware Unique DB ….. Source n New database New architecture: periodically updated Jarrar © 2013 7
  • 8. Information Integration Architectures Source: Carlo Batini Virtual Data Integration Source 1 Local schema Mediator Source 2 Local schema Local schema Local schema Local schema Global schema ….. Source n Local schema New architecture No new database! Jarrar © 2013 8
  • 9. The integration problem… Source: Carlo Batini Registry of clients 1 Source 1 Registry of clients 2 Source 2 Retail sales On line sales Source 3 Which kind of integration? New architecture How to decide? Source 4 ….. Other Source n Jarrar © 2013 9
  • 10. Criteria to be adopted Source: Carlo Batini •  Autonomy, the degree of independence between the different database administrators in their design choices; •  Relevance of historical data, and consequent need to periodically store new data without deleting the old ones; •  Query complexity, in terms of amount of data and tables visited and number of operators on them, and consequent time complexity in query execution; •  Relevance of currency in queries, the need for queries to extract current data; •  Economic value of integration, the relevance of having integrated information in input for business operational and decisional processes in order to produce effective outputs; Jarrar © 2013 10
  • 11. Criteria to be adopted Source: Carlo Batini •  Volatility of sources, frequency of adding or deleting sources, and frequency of change of source schemas; •  Relevance of queries w.r.t transactions, relative importance and frequency of queries with respect to changes in data; •  Management complexity, the effort to be spent in management activities related to databases and hw-sw infrastructures, due to the corresponding complexity of the organizations using the data bases; •  Costs of heterogeneity, hidden and explicit costs related to business processes that are due to making use of heterogeneous data. Jarrar © 2013 11
  • 12. References and Acknowledge •  Carlo Batini: Course on Data Integration. BZU IT Summer School 2011. •  Stefano Spaccapietra: Information Integration. Presentation at the IFIP Academy. Porto Alegre. 2005. •  Chris Bizer: The Emerging Web of Linked Data. Presentation at SRI International, Artificial Intelligence Center. Menlo Park, USA. 2009. Thanks to Anton Deik for helping me preparing this lecture Jarrar © 2013 12

×