Lecture 10 Distributed Database Management Systems
Published in: Education
    1. Lecture 10 Distributed Database Management Systems
    2. Evolution of DDBMS <ul><li>Distributed database management systems (DDBMS) </li></ul><ul><ul><li>Interconnected computer systems </li></ul></ul><ul><ul><li>Data/processing functions reside on multiple sites </li></ul></ul><ul><li>1970s: Centralized DBMS </li></ul><ul><li>1980s: Social and technical changes </li></ul><ul><ul><li>Ad hoc capability required </li></ul></ul><ul><ul><li>Decentralized management structure common </li></ul></ul><ul><li>1990s: New forces </li></ul><ul><ul><li>Internet and the World Wide Web used for data access and distribution </li></ul></ul><ul><ul><li>Data analysis through data mining and data warehousing </li></ul></ul>
    3. DDBMS Advantages <ul><li>Data located near site with greatest demand </li></ul><ul><li>Faster data access </li></ul><ul><li>Faster data processing </li></ul><ul><li>Growth facilitation </li></ul><ul><li>Improved communications </li></ul><ul><li>Reduced operating costs </li></ul><ul><li>User-friendly interface </li></ul><ul><li>Less danger of single-point failure </li></ul><ul><li>Processor independence </li></ul>
    4. DDBMS Disadvantages <ul><li>Complexity of management and control </li></ul><ul><li>Security </li></ul><ul><li>Lack of standards </li></ul><ul><li>Increased storage requirements </li></ul><ul><li>Greater difficulty in managing data environment </li></ul><ul><li>Increased training costs </li></ul>
    5. Distributed Processing <ul><li>Shares the database’s logical processing among physically independent, networked sites </li></ul>Figure 10.1
    6. Distributed Database <ul><li>Stores a logically related database across physically independent sites </li></ul>Figure 10.2
    7. Distributed Database vs. Distributed Processing <ul><li>Distributed processing </li></ul><ul><ul><li>Does not require a distributed database </li></ul></ul><ul><ul><li>May be based on a single database on a single computer </li></ul></ul><ul><ul><li>Copies or parts of database processing functions must be distributed to all data storage sites </li></ul></ul><ul><li>Distributed database </li></ul><ul><ul><li>Requires distributed processing </li></ul></ul><ul><li>Both </li></ul><ul><ul><li>Require a network to connect components </li></ul></ul>
    8. Functions of DDBMS <ul><li>Application/end user interface </li></ul><ul><li>Validation to analyze data requests </li></ul><ul><li>Transformation to determine request components </li></ul><ul><li>Query optimization to find the best access strategy </li></ul><ul><li>Mapping to determine the data location </li></ul><ul><li>I/O interface to read or write data </li></ul><ul><li>Formatting to prepare the data for presentation </li></ul><ul><li>Security to provide data privacy </li></ul><ul><li>Backup and recovery </li></ul><ul><li>Database administration </li></ul><ul><li>Concurrency control </li></ul><ul><li>Transaction management </li></ul>
    9. Centralized Database Figure 10.3
    10. Fully Distributed Database Management System Figure 10.4
    11. DDBMS Components <ul><li>Computer workstations </li></ul><ul><li>Network hardware and software components </li></ul><ul><li>Communications media </li></ul><ul><li>Transaction processor (TP) </li></ul><ul><ul><li>Also called application processor (AP) or transaction manager (TM) </li></ul></ul><ul><li>Data processor (DP) </li></ul><ul><ul><li>Also called data manager (DM) </li></ul></ul>
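
The split between the two software components listed on this slide can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the class names `DataProcessor` and `TransactionProcessor`, the in-memory tables, and the table-to-site catalog are all hypothetical and not taken from the lecture.

```python
class DataProcessor:
    """Hypothetical DP: stores and retrieves the data held at one site."""

    def __init__(self, site, tables):
        self.site = site
        self.tables = tables  # table name -> list of row dicts

    def read(self, table):
        # Return a copy so callers cannot modify the site's stored rows.
        return list(self.tables[table])


class TransactionProcessor:
    """Hypothetical TP: receives an application request and forwards it
    to whichever DP holds the requested table."""

    def __init__(self, data_processors):
        # Assumed catalog: maps each table name to the DP that stores it.
        self.catalog = {
            table: dp for dp in data_processors for table in dp.tables
        }

    def request(self, table):
        dp = self.catalog[table]
        return dp.site, dp.read(table)


dp_a = DataProcessor("site_A", {"CUSTOMER": [{"id": 1}]})
dp_b = DataProcessor("site_B", {"INVOICE": [{"id": 10}, {"id": 11}]})
tp = TransactionProcessor([dp_a, dp_b])
```

Here `tp.request("INVOICE")` is answered by the DP at `site_B` without the caller naming that site. A real DDBMS would also handle networking, security, and concurrency, which this sketch leaves out.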
    12. Distributed Database Components Figure 10.5
    13. DDBMS Protocols <ul><li>Interface with network to transport data and commands between DPs and TPs </li></ul><ul><li>Synchronize data received from DPs and route to appropriate TPs </li></ul><ul><li>Ensure common database functions </li></ul><ul><ul><li>Security </li></ul></ul><ul><ul><li>Concurrency control </li></ul></ul><ul><ul><li>Backup and recovery </li></ul></ul>
    14. Levels of Data and Process Distribution <ul><li>Database systems can be classified based on process distribution and data distribution </li></ul>Table 10.1
    15. Single-Site Processing, Single-Site Data (SPSD) <ul><li>All processing on single CPU or host computer </li></ul><ul><li>All data are stored on host computer disk </li></ul><ul><li>DBMS located on the host computer </li></ul><ul><li>DBMS accessed by dumb terminals </li></ul><ul><li>Typical of mainframe and minicomputer DBMSs </li></ul><ul><li>Typical of the first generation of single-user microcomputer databases </li></ul>
    16. Single-Site Processing, Single-Site Data (cont.) Figure 10.6
    17. Multiple-Site Processing, Single-Site Data (MPSD) <ul><ul><li>Requires network file server </li></ul></ul><ul><ul><li>Applications accessed through LAN </li></ul></ul><ul><ul><li>Variation known as client/server architecture </li></ul></ul>Figure 10.7
    18. Multiple-Site Processing, Multiple-Site Data (MPMD) <ul><li>Fully distributed DDBMS with support for multiple DPs and TPs at multiple sites </li></ul><ul><ul><li>Homogeneous </li></ul></ul><ul><ul><ul><li>Integrate one type of centralized DBMS over the network </li></ul></ul></ul><ul><ul><li>Heterogeneous </li></ul></ul><ul><ul><ul><li>Integrate different types of centralized DBMSs over a network </li></ul></ul></ul>
    19. Heterogeneous Distributed Database Scenario Figure 10.8
    20. Distributed DB Transparency <ul><li>Allows end users to feel like the database’s only user </li></ul><ul><li>Hides complexities of distributed database </li></ul><ul><li>Transparency features </li></ul><ul><ul><li>Distribution </li></ul></ul><ul><ul><li>Transaction </li></ul></ul><ul><ul><li>Failure </li></ul></ul><ul><ul><li>Performance </li></ul></ul><ul><ul><li>Heterogeneity </li></ul></ul>
    21. Distribution Transparency <ul><li>Allows management of a physically dispersed database as though it were centralized </li></ul><ul><li>Three Levels </li></ul><ul><ul><li>Fragmentation transparency </li></ul></ul><ul><ul><li>Location transparency </li></ul></ul><ul><ul><li>Local mapping transparency </li></ul></ul>Table 10.2
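
One way to read the three levels named on this slide is by how much the requester must spell out: both fragment names and sites (local mapping), fragment names only (location), or neither (fragmentation). The Python sketch below illustrates that reading. The fragment names `E1` to `E3`, the site names, and the sample rows are hypothetical, and the functions are a toy model rather than any real DDBMS interface.

```python
# Hypothetical layout: an EMPLOYEE table split into three fragments,
# each stored at a different site.
FRAGMENT_SITE = {"E1": "NEW_YORK", "E2": "ATLANTA", "E3": "MIAMI"}
SITE_DATA = {
    ("NEW_YORK", "E1"): [{"name": "Ann", "year": 1955}],
    ("ATLANTA", "E2"): [{"name": "Bob", "year": 1970}],
    ("MIAMI", "E3"): [{"name": "Cy", "year": 1958}],
}


def local_mapping_query(fragments_with_sites, predicate):
    """Local mapping transparency: the caller names each fragment AND its site."""
    rows = []
    for fragment, site in fragments_with_sites:
        rows += [r for r in SITE_DATA[(site, fragment)] if predicate(r)]
    return rows


def location_query(fragments, predicate):
    """Location transparency: the caller names fragments; sites are looked up."""
    pairs = [(f, FRAGMENT_SITE[f]) for f in fragments]
    return local_mapping_query(pairs, predicate)


def fragmentation_query(predicate):
    """Fragmentation transparency: the caller names neither fragments nor sites."""
    return location_query(sorted(FRAGMENT_SITE), predicate)


def born_before_1960(row):
    return row["year"] < 1960
```

All three calls return the same rows for the same predicate; what changes is how much of the physical layout the caller has to know.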
    22. Transaction Transparency <ul><li>Ensures transactions maintain integrity and consistency </li></ul><ul><li>Completed only if all involved database sites complete their part of the transaction </li></ul><ul><li>Management mechanisms </li></ul><ul><ul><li>Remote request </li></ul></ul><ul><ul><li>Remote transaction </li></ul></ul><ul><ul><li>Distributed transaction </li></ul></ul><ul><ul><li>Distributed request </li></ul></ul>
    23. Remote Request Figure 10.10
    24. Remote Transaction Figure 10.11
    25. Distributed Transaction Figure 10.12
    26. Distributed Requests Figure 10.13
    27. Distributed Requests (cont.) Figure 10.14
    28. Distributed Concurrency Control <ul><li>Multisite, multiple-process operations are more likely to create data inconsistencies and deadlocked transactions </li></ul><ul><li>Problems </li></ul><ul><ul><li>Transaction is committed by the local DPs </li></ul></ul><ul><ul><li>One DP cannot commit the transaction’s result </li></ul></ul><ul><ul><li>Result is an inconsistent database </li></ul></ul>
    29. Two-Phase Commit Protocol <ul><li>DO-UNDO-REDO protocol </li></ul><ul><ul><li>Write-ahead protocol </li></ul></ul><ul><ul><li>Two kinds of nodes </li></ul></ul><ul><ul><ul><li>Coordinator </li></ul></ul></ul><ul><ul><ul><li>Subordinates </li></ul></ul></ul><ul><li>Phases </li></ul><ul><ul><li>Preparation </li></ul></ul><ul><ul><ul><li>Coordinator sends message to all subordinates </li></ul></ul></ul><ul><ul><ul><li>Confirms all are ready to commit or abort </li></ul></ul></ul><ul><ul><li>Final Commit </li></ul></ul><ul><ul><ul><li>Ensures all subordinates have committed or aborted </li></ul></ul></ul>
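
The two phases on this slide can be sketched in a few lines of Python. This is a minimal, single-process sketch under stated assumptions: the `Subordinate` class, its `can_commit` flag (standing in for whatever local checks a real site performs), and the log strings are hypothetical, and the sketch ignores network failures, timeouts, and recovery from the log.

```python
class Subordinate:
    """Hypothetical subordinate node with a simple write-ahead log."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: result of local checks
        self.log = []                 # log entries are written before acting
        self.state = "ACTIVE"

    def prepare(self):
        # Phase 1: record the vote in the log, then reply to the coordinator.
        self.log.append("PREPARED" if self.can_commit else "NOT PREPARED")
        return self.can_commit

    def commit(self):
        self.log.append("COMMIT")     # logged first (write-ahead)
        self.state = "COMMITTED"

    def abort(self):
        self.log.append("ABORT")      # logged first (write-ahead)
        self.state = "ABORTED"


def two_phase_commit(subordinates):
    """Coordinator logic: commit everywhere only if every subordinate is ready."""
    # Phase 1 (preparation): ask every subordinate whether it can commit.
    votes = [s.prepare() for s in subordinates]

    # Phase 2 (final commit): all commit, or all abort.
    if all(votes):
        for s in subordinates:
            s.commit()
        return "COMMITTED"
    for s in subordinates:
        s.abort()
    return "ABORTED"
```

With this structure a single "not ready" vote aborts the transaction at every site, which avoids the situation where some DPs have committed a transaction and another could not.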
    30. Performance Transparency and Query Optimization <ul><li>Objective: Minimize total cost associated with execution of request </li></ul><ul><li>Main costs </li></ul><ul><ul><li>Access time </li></ul></ul><ul><ul><li>Communication </li></ul></ul><ul><ul><li>CPU time </li></ul></ul><ul><li>Basis for query optimization algorithms </li></ul><ul><ul><li>Optimum execution order </li></ul></ul><ul><ul><li>Sites accessed to minimize communication costs </li></ul></ul><ul><li>Dynamic or static optimization </li></ul><ul><li>Statistically based vs. rule-based query optimization algorithms </li></ul>
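
The objective on this slide can be written as a simple cost model: total cost is the sum of access time, communication, and CPU time, and the optimizer prefers the plan with the lowest total. The Python sketch below assumes a plain unweighted sum and uses made-up plan names and cost figures; a real optimizer would estimate these values from statistics or rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical execution plan with the three cost components from the slide."""
    name: str
    access_time: float    # I/O cost of reading the data
    communication: float  # cost of moving data between sites
    cpu_time: float       # processing overhead

    @property
    def total_cost(self):
        # Assumption: the components are in the same unit and simply add up.
        return self.access_time + self.communication + self.cpu_time


def cheapest(plans):
    """Pick the plan with the lowest total cost."""
    return min(plans, key=lambda p: p.total_cost)


# Illustrative figures only: shipping a whole table is dominated by
# communication cost, while filtering at the remote site ships fewer rows.
plans = [
    Plan("ship_whole_table", access_time=2.0, communication=10.0, cpu_time=1.0),
    Plan("filter_at_remote_site", access_time=3.0, communication=1.5, cpu_time=1.5),
]
best = cheapest(plans)
```

In this example the plan that filters at the remote site wins because it cuts communication cost, matching the slide's point about choosing which sites to access.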
    31. Distributed Database Design <ul><li>Partition database into fragments </li></ul><ul><ul><li>Horizontal </li></ul></ul><ul><ul><li>Vertical </li></ul></ul><ul><ul><li>Mixed </li></ul></ul><ul><li>Fragments to replicate </li></ul><ul><ul><li>Storage of data copies at multiple sites </li></ul></ul><ul><ul><li>Fully, partially, unreplicated databases </li></ul></ul><ul><li>Data allocation </li></ul><ul><ul><li>Where to locate data </li></ul></ul><ul><ul><li>Centralized, partitioned, replicated </li></ul></ul>
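
The three fragmentation strategies on this slide can be shown on a small in-memory table. In this sketch a horizontal fragment is a subset of rows, a vertical fragment is a subset of columns that repeats the key so rows can be rejoined, and a mixed fragment applies both. The `CUSTOMERS` rows, column names, and function names are hypothetical.

```python
# Hypothetical table used only for illustration.
CUSTOMERS = [
    {"id": 1, "name": "Ann", "state": "TN", "balance": 120.0},
    {"id": 2, "name": "Bob", "state": "FL", "balance": 0.0},
    {"id": 3, "name": "Cy", "state": "TN", "balance": 75.5},
]


def horizontal_fragment(rows, predicate):
    """Subset of rows; every fragment keeps all columns."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, key, columns):
    """Subset of columns; the key is repeated so fragments can be rejoined."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]


def mixed_fragment(rows, predicate, key, columns):
    """Horizontal fragmentation followed by vertical fragmentation."""
    return vertical_fragment(horizontal_fragment(rows, predicate), key, columns)


tn_rows = horizontal_fragment(CUSTOMERS, lambda r: r["state"] == "TN")
fl_rows = horizontal_fragment(CUSTOMERS, lambda r: r["state"] == "FL")
balances = vertical_fragment(CUSTOMERS, "id", ["balance"])
tn_names = mixed_fragment(CUSTOMERS, lambda r: r["state"] == "TN", "id", ["name"])
```

Taking the union of `tn_rows` and `fl_rows` gives back the original table, which is the property a horizontal partitioning scheme needs. Replication and allocation would then decide which sites hold which fragments.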
    32. Client/Server Advantages Over DDBMS <ul><li>Client/server less expensive </li></ul><ul><li>Client/server solutions allow use of microcomputer’s GUI </li></ul><ul><li>More people with PC skills than mainframe skills </li></ul><ul><li>PC is well established in workplace </li></ul><ul><li>Numerous data analysis and query tools exist </li></ul><ul><li>Considerable cost advantages to off-loading application development </li></ul>
    33. Client/Server Disadvantages <ul><li>Creates more complex environment with different platforms </li></ul><ul><li>Increased number of users and sites creates security problems </li></ul><ul><li>Training issues become more complex and expensive </li></ul>