Database Principles 3rd Ed., Coronel, Morris, Crockett, Blewett
©2020 Cengage EMEA
Database Principles
Fundamentals of Design,
Implementation, and Management
Carlos Coronel, Steven Morris,
Keeley Crockett, Craig Blewett
CHAPTER 14
DISTRIBUTED DATABASES
In this chapter, you will learn:
• What a distributed database management system
(DDBMS) is and what its components are
• How database implementation is affected by different
levels of data and process distribution
• How transactions are managed in a distributed database
environment
• How database design is affected by the distributed
database environment
The Evolution of Distributed Database
Management Systems
• Distributed database management system
(DDBMS)
– Governs storage and processing of logically related
data over interconnected computer systems in
which both data and processing functions are
distributed among several sites
The Evolution of Distributed Database
Management Systems (continued)
• Centralised database required that corporate
data be stored in a single central site
• Dynamic business environment and
centralised database’s shortcomings spawned
a demand for applications based on data
access from different sources at multiple
locations
DDBMS Advantages and Disadvantages
• Advantages include:
– Data are located near “greatest demand” site
– Faster data access
– Faster data processing
– Growth facilitation
– Improved communications
DDBMS Advantages and Disadvantages
(continued)
• Advantages include (continued):
– Reduced operating costs
– User-friendly interface
– Less danger of a single-point failure
– Processor independence
DDBMS Advantages and Disadvantages
(continued)
• Disadvantages include:
– Complexity of management and control
– Security
– Lack of standards
– Increased storage requirements
– Increased training cost
– Higher costs
Distributed Processing and Distributed
Databases
Characteristics of Distributed Database
Management Systems
• Application interface
• Validation
• Transformation
• Query optimisation
• Mapping
• I/O interface
Characteristics of Distributed Database
Management Systems (continued)
• Formatting
• Security
• Backup and recovery
• DB administration
• Concurrency control
• Transaction management
Characteristics of Distributed Database
Management Systems (continued)
• Must perform all the functions of centralised
DBMS
• Must handle all necessary functions imposed
by distribution of data and processing
– Must perform these additional functions
transparently to the end user
DDBMS Components
• Must include (at least) the following
components:
– Computer workstations
– Network hardware and software
– Communications media
– Transaction processor (application processor,
transaction manager)
• Software component found in each computer that
requests data
DDBMS Components (continued)
• Must include (at least) the following
components (continued):
– Data processor (DP)
• Software component residing on each computer
that stores and retrieves data located at the site
• May be a centralised DBMS
Levels of Data and Process
Distribution
Single-Site Processing, Single-Site
Data (SPSD)
• In the single-site processing, single-site data
(SPSD) scenario, all processing is done on a
single host computer and all data are stored
on the host computer’s local disk.
• The DBMS is located on the host computer,
which is accessed by dumb terminals.
• TP and the DP are embedded within the
DBMS.
Multiple-Site Processing,
Single-Site Data (MPSD)
• Multiple processes run on different computers
sharing single data repository
• MPSD scenario requires a network file server
running conventional applications that are
accessed through a network.
• Many multiuser accounting applications,
running under personal computer network, fit
such a description
Multiple-Site Processing, Multiple-Site
Data (MPMD)
• Fully distributed database management
system with support for multiple data
processors and transaction processors at
multiple sites
• Classified as either homogeneous or
heterogeneous
• Homogeneous DDBMSs
– Integrate only one type of centralised DBMS over
a network
Multiple-Site Processing, Multiple-Site
Data (MPMD) (continued)
• Heterogeneous DDBMSs
– Integrate different types of centralised DBMSs
over a network
• Fully heterogeneous DDBMS
– Support different DBMSs that may even support
different data models (relational, hierarchical, or
network) running under different computer
systems, such as mainframes and microcomputers
Distributed Database Transparency
Features
• Allow end user to feel like database’s only user
• Features include:
– Distribution transparency
– Transaction transparency
– Failure transparency
– Performance transparency
– Heterogeneity transparency
Distribution Transparency
• Allows management of physically dispersed
database as though it were a centralised
database
• Following three levels of distribution
transparency are recognised:
– Fragmentation transparency
– Location transparency
– Local mapping transparency
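The three levels differ in how much the end user must spell out in a request. The sketch below is only an illustration: the catalog contents, the table name, the fragment names and the site names are all hypothetical, and the usual reading of the levels is assumed (fragmentation transparency needs neither fragment names nor locations, location transparency needs fragment names only, local mapping transparency needs both).

```python
# Hypothetical catalog: table -> {fragment name: site holding it}
CATALOG = {"EMPLOYEE": {"E1": "site_a", "E2": "site_b", "E3": "site_c"}}


def resolve(table, fragments=None, sites=None):
    """Return (transparency level, [(fragment, site), ...]) for a request."""
    known = CATALOG[table]
    if fragments is None:
        # User names only the table; the system finds fragments and sites.
        return "fragmentation transparency", sorted(known.items())
    if sites is None:
        # User names the fragments; the system finds where they live.
        return "location transparency", [(f, known[f]) for f in fragments]
    # User names both the fragments and their locations.
    return "local mapping transparency", list(zip(fragments, sites))


level, targets = resolve("EMPLOYEE")
print(level, targets)
```

In each case the same fragments end up being read; what changes is how much of the mapping the user has to supply.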
Transaction Transparency
• Ensures database transactions will maintain
distributed database’s integrity and
consistency
Distributed Requests and
Distributed Transactions
• Distributed transaction
– Can update or request data from several different
remote sites on network
• Remote request
– Lets single SQL statement access data to be
processed by single remote database processor
• Remote transaction
– Accesses data at single remote site
Distributed Requests and
Distributed Transactions (continued)
• Distributed transaction
– Allows transaction to reference several different
(local or remote) DP sites
• Distributed request
– Lets single SQL statement reference data located
at several different local or remote DP sites
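The four terms above can be told apart by looking at which DP sites each SQL statement touches. The following is a minimal sketch, not a DDBMS feature: the function name, the "local" site label and the representation of a unit of work as a list of site sets (one set per SQL statement) are assumptions made for illustration.

```python
from typing import List, Set


def classify(requests: List[Set[str]], local_site: str = "local") -> str:
    """Classify a unit of work by the sites its SQL statements reference."""
    all_sites = set().union(*requests)
    if not (all_sites - {local_site}):
        return "local"  # nothing remote is involved
    if any(len(sites) > 1 for sites in requests):
        # A single statement references several DP sites.
        return "distributed request"
    if len(all_sites) > 1:
        # Each statement uses one site, but the statements use different sites.
        return "distributed transaction"
    if len(requests) == 1:
        return "remote request"  # one statement, one remote site
    return "remote transaction"  # several statements, one remote site


print(classify([{"site_b"}]))              # remote request
print(classify([{"site_b"}, {"site_c"}]))  # distributed transaction
print(classify([{"site_b", "site_c"}]))    # distributed request
```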
Distributed Concurrency Control
• Multisite, multiple-process operations are
much more likely to create data
inconsistencies and deadlocked transactions
than are single-site systems
Two-Phase Commit Protocol
• Distributed databases make it possible for
transaction to access data at several sites
• Final COMMIT must not be issued until all
sites have committed their parts of
transaction
• Two-phase commit protocol requires each
individual DP’s transaction log entry be
written before database fragment is actually
updated
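The idea can be sketched as a coordinator that first asks every DP to prepare and only then tells them all to commit. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, failures are simulated with a flag, and real protocols also handle timeouts and coordinator recovery.

```python
class Participant:
    """A data processor (DP) holding one fragment and its transaction log."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a site that cannot prepare
        self.log = []                 # written before the fragment is touched
        self.data = {}                # local database fragment
        self.pending = None

    def prepare(self, key, value):
        # Phase 1: log the intended change, then vote.
        if not self.can_commit:
            self.log.append(("VOTE-ABORT", key))
            return False
        self.log.append(("PREPARED", key, self.data.get(key), value))
        self.pending = (key, value)
        return True

    def commit(self):
        # Phase 2: only now is the fragment actually updated.
        key, value = self.pending
        self.data[key] = value
        self.log.append(("COMMITTED", key))
        self.pending = None

    def abort(self):
        self.pending = None
        self.log.append(("ABORTED",))


def two_phase_commit(participants, key, value):
    votes = [p.prepare(key, value) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.abort()
    return "ABORT"


sites = [Participant("A"), Participant("B", can_commit=False)]
print(two_phase_commit(sites, "x", 1))  # ABORT, no fragment is changed
```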
Two-Phase Commit Protocol
• The DO-UNDO-REDO protocol is used by the DP
to roll back and/or roll forward transactions with
the help of the system’s transaction log entries.
• The DO-UNDO-REDO protocol defines three types
of operations:
– DO performs the operation and records the ‘before’
and ‘after’ values in the transaction log.
– UNDO reverses an operation, using the log entries
written by the DO portion of the sequence.
– REDO redoes an operation, using the log entries
written by the DO portion of the sequence.
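A minimal sketch of the three operations, assuming a key-value store and a log of (key, before, after) entries. The class name is hypothetical, and using None to mean "key did not exist" is a simplification.

```python
class LoggedStore:
    def __init__(self):
        self.data = {}
        self.log = []  # each entry: (key, before value, after value)

    def do(self, key, value):
        """Perform the operation, recording before and after values first."""
        before = self.data.get(key)
        self.log.append((key, before, value))
        self.data[key] = value
        return len(self.log) - 1  # position of the log entry

    def undo(self, entry_id):
        """Reverse an operation using the 'before' value from the log."""
        key, before, _after = self.log[entry_id]
        if before is None:
            self.data.pop(key, None)
        else:
            self.data[key] = before

    def redo(self, entry_id):
        """Reapply an operation using the 'after' value from the log."""
        key, _before, after = self.log[entry_id]
        self.data[key] = after


store = LoggedStore()
first = store.do("balance", 100)
second = store.do("balance", 80)
store.undo(second)   # roll back: balance is 100 again
store.redo(second)   # roll forward: balance is 80 again
print(store.data)
```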
Performance and Failure Transparency
• Objective of query optimisation routine is to
minimise total cost associated with execution
of request
• Costs associated with request are function of:
– Access time (I/O) cost
– Communication cost
– CPU time cost
• Must provide distribution transparency as well
as replica transparency
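As a rough illustration of the optimisation objective, the sketch below picks the cheapest of several candidate execution plans. The slide says only that cost is a function of the three components; the plain unweighted sum, the plan names and all the numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    io_cost: float    # access time (I/O) cost
    comm_cost: float  # communication cost between sites
    cpu_cost: float   # CPU time cost

    @property
    def total(self) -> float:
        # Assumed cost model: a simple sum of the three components.
        return self.io_cost + self.comm_cost + self.cpu_cost


def cheapest(plans):
    """Return the plan with the lowest total cost."""
    return min(plans, key=lambda p: p.total)


plans = [
    Plan("ship whole table to local site", io_cost=5, comm_cost=40, cpu_cost=2),
    Plan("filter at remote site first", io_cost=8, comm_cost=6, cpu_cost=3),
]
print(cheapest(plans).name)
```

In this made-up case the communication cost dominates, so filtering at the remote site wins even though it costs more I/O and CPU.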
Performance and Failure Transparency
• Resolving data requests in a distributed
data environment must take the following
points into consideration:
– Data distribution
– Data replication
– Network and node availability
Distributed Database Design
• Data fragmentation
– How to partition database into fragments
• Data replication
– Which fragments to replicate
• Data allocation
– Where to locate those fragments and replicas
Data Fragmentation
• Breaks single object into two or more
segments or fragments
• Each fragment can be stored at any site over
computer network
• Information about data fragmentation is
stored in distributed data catalog (DDC), from
which it is accessed by TP to process user
requests
Data Fragmentation (continued)
• Strategies
– Horizontal fragmentation
• Division of a relation into subsets (fragments) of tuples
(rows)
– Vertical fragmentation
• Division of a relation into attribute (column) subsets
– Mixed fragmentation
• Combination of horizontal and vertical strategies
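The three strategies can be sketched on a small in-memory relation. The table, its columns and the helper names below are hypothetical; the point is only that horizontal fragments are rebuilt by union, and vertical fragments by joining on the key that each fragment repeats.

```python
customers = [
    {"id": 1, "name": "Ann", "region": "north", "balance": 120.0},
    {"id": 2, "name": "Ben", "region": "south", "balance": 75.5},
    {"id": 3, "name": "Cho", "region": "north", "balance": 0.0},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows (tuples) that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key="id"):
    """Keep a subset of columns; the key is always included."""
    cols = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in cols} for row in rows]


def rejoin_vertical(left, right, key="id"):
    """Rebuild full rows by matching vertical fragments on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


north = horizontal_fragment(customers, lambda r: r["region"] == "north")
south = horizontal_fragment(customers, lambda r: r["region"] == "south")
contact = vertical_fragment(customers, ["name", "region"])
finance = vertical_fragment(customers, ["balance"])
# Mixed fragmentation: a horizontal fragment split vertically.
north_finance = vertical_fragment(north, ["balance"])
print(north_finance)
```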
Data Replication
• Storage of data copies at multiple sites served
by computer network
• Fragment copies can be stored at several sites
to serve specific information requirements
– Can enhance data availability and response time
– Can help to reduce communication and total
query costs
Data Replication (continued)
• Replication scenarios
– Fully replicated database
• Stores multiple copies of each database fragment at
multiple sites
• Can be impractical due to amount of overhead
– Partially replicated database
• Stores multiple copies of some database fragments at
multiple sites
• Most DDBMSs are able to handle the partially
replicated database well
Data Replication (continued)
• Replication scenarios (continued)
– Unreplicated database
• Stores each database fragment at single site
• No duplicate database fragments
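The three scenarios can be distinguished from a map of which sites hold a copy of each fragment. This is a small sketch with a hypothetical function name and made-up fragment and site names.

```python
def replication_level(placement):
    """placement maps fragment name -> set of sites holding a copy."""
    copies = [len(sites) for sites in placement.values()]
    if all(n == 1 for n in copies):
        return "unreplicated"          # each fragment at a single site
    if all(n > 1 for n in copies):
        return "fully replicated"      # every fragment has multiple copies
    return "partially replicated"      # only some fragments are copied


print(replication_level({"F1": {"site_a", "site_b"}, "F2": {"site_c"}}))
```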
Data Allocation
• Deciding where to locate data
• Allocation strategies
– Centralised data allocation
• Entire database is stored at one site
– Partitioned data allocation
• Database is divided into several disjoint parts
(fragments) and stored at several sites
Data Allocation (continued)
• Allocation strategies (continued)
– Replicated data allocation
• Copies of one or more database fragments are stored
at several sites
• Data distribution over computer network is
achieved through data partition, data
replication, or combination of both
The CAP Theorem
• The initials CAP stand for the three desirable
properties
– Consistency. In a distributed database, consistency
takes a bigger role. All nodes should see the same data
at the same time, which means that the replicas
should be immediately updated.
– Availability. Simply speaking, a request is always
fulfilled by the system. No received request is ever lost.
If you are buying tickets online, you do not want the
system to stop in the middle of the operation.
– Partition tolerance. The system continues to operate
even in the event of a node failure or a break in the
network that separates groups of nodes.
The CAP Theorem (continued)
• The trade-off between consistency and
availability has generated a new type of
distributed data system in which data are
basically available, soft state, eventually
consistent (BASE).
• BASE refers to a data consistency model in
which data changes are not immediate but
propagate slowly through the system until all
replicas are eventually consistent
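A toy model of this behaviour, assuming one replica accepts each write and the others catch up later. The class names and the propagation queue are hypothetical simplifications; real systems also resolve conflicting writes.

```python
from collections import deque


class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None


class EventuallyConsistentStore:
    def __init__(self, replica_names):
        self.replicas = [Replica(n) for n in replica_names]
        self.pending = deque()  # updates not yet applied everywhere

    def write(self, value):
        # The first replica accepts the write immediately (availability)...
        self.replicas[0].value = value
        # ...and the change is queued for the other replicas.
        for replica in self.replicas[1:]:
            self.pending.append((replica, value))

    def read(self, index):
        return self.replicas[index].value  # may be stale (soft state)

    def propagate_one(self):
        if self.pending:
            replica, value = self.pending.popleft()
            replica.value = value

    def is_consistent(self):
        return len({r.value for r in self.replicas}) == 1


store = EventuallyConsistentStore(["A", "B", "C"])
store.write("v1")
print(store.read(0), store.read(2), store.is_consistent())
store.propagate_one()
store.propagate_one()
print(store.read(2), store.is_consistent())
```

Between the write and the last propagation step, a read from replica C returns the old value; once the queue drains, all replicas agree.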
Distributed Databases Within the Cloud
• Cloud computing is a new style of delivering
applications, data and resources to users over the web.
It provides an alternative for organisations who do not
wish to provide their own information technology (IT)
infrastructure to host their own databases or software.
• A third party cloud provider uses a number of
interconnected and virtualised computers to supply a
range of IT services which are standardised.
• Each third-party cloud provider will have its own
flexible pricing model for each service it provides.
• The terms of each service are often set out in a
service level agreement.
Distributed Databases Within The
Cloud (continued)
• The main benefits to an organisation of using a cloud infrastructure are:
– Cost Effectiveness. As the third-party cloud provider is likely to be
hosting services for many organisations, only one IT infrastructure is
required, which reduces the cost to each individual organisation.
– Latest Software. In order to remain competitive, most third-party
cloud providers will ensure that their software is always the latest
version available.
– Scalable Architecture. If the data requirements of the organisation
expand, it is easy to increase the database capacity and/or change the
underlying data model.
– Mobile Access. Data and software within the cloud can be accessed
generally from anywhere.
Distributed Databases Within The
Cloud (continued)
• Current NoSQL solutions include column stores and
document stores.
• Column stores are for large-scale distributed systems
which store petabytes of data across hundreds, if not
thousands, of servers.
– For example, Google uses BigTable to store its
structured data for applications such as Google Earth.
• Document stores move away from storing data in
tables. Instead, each document is stored differently
depending upon its size and format. Document stores
are also referred to as document-oriented databases.
C. J. Date’s Twelve Commandments for
Distributed Databases
• Local site independence
• Central site independence
• Failure independence
• Location transparency
• Fragmentation transparency
• Replication transparency
C. J. Date’s Twelve Commandments for
Distributed Databases (continued)
• Distributed query processing
• Distributed transaction processing
• Hardware independence
• Operating system independence
• Network independence
• Database independence
Summary
• Distributed database stores logically related data in two or
more physically independent sites connected via computer
network
• Distributed processing is division of logical database
processing among two or more network nodes
• Distributed databases require distributed processing
• Main components of DDBMS are transaction processor and
data processor
Summary (continued)
• Current database systems can be classified by
extent to which they support processing and
data distribution
• Homogeneous distributed database system
integrates only one particular type of DBMS
over computer network
• Heterogeneous distributed database system
integrates several different types of DBMSs
over computer network
Summary (continued)
• DDBMS characteristics are best described as set of
transparencies
• Transaction is formed by one or more database requests
• Distributed concurrency control is required in network of
distributed databases
• Distributed DBMS evaluates every data request to find
optimum access path in distributed database
Summary (continued)
• The design of distributed database must
consider fragmentation and replication of data
• Database can be replicated over several
different sites on computer network
• The CAP theorem states that a highly
distributed data system has some desirable
properties of consistency, availability, and
partition tolerance. However, a system can
only provide two of these properties at a time.

More Related Content

What's hot

Solving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaSolving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaRandy Goering
 
Citrix XenDesktop and XenApp 7.5 Architecture Deployment
Citrix XenDesktop and XenApp 7.5 Architecture DeploymentCitrix XenDesktop and XenApp 7.5 Architecture Deployment
Citrix XenDesktop and XenApp 7.5 Architecture DeploymentHuy Pham
 
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...John Campbell
 
IBM DB2 for z/OS Administration Basics
IBM DB2 for z/OS Administration BasicsIBM DB2 for z/OS Administration Basics
IBM DB2 for z/OS Administration BasicsIBM
 
Wk 1 - File organization.pptx
Wk 1 - File organization.pptxWk 1 - File organization.pptx
Wk 1 - File organization.pptxDORCASGABRIEL1
 
Data Flow Diagram
Data Flow DiagramData Flow Diagram
Data Flow Diagramnethisip13
 
DB2 Basic Commands - UDB
DB2 Basic Commands - UDBDB2 Basic Commands - UDB
DB2 Basic Commands - UDBSrinimf-Slides
 
20 DFSORT Tricks For Zos Users - Interview Questions
20 DFSORT Tricks For Zos Users - Interview Questions20 DFSORT Tricks For Zos Users - Interview Questions
20 DFSORT Tricks For Zos Users - Interview QuestionsSrinimf-Slides
 
Database backup & recovery
Database backup & recoveryDatabase backup & recovery
Database backup & recoveryMustafa Khan
 
Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...
Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...
Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...Cloud Computing Wire
 
Best practices for DB2 for z/OS log based recovery
Best practices for DB2 for z/OS log based recoveryBest practices for DB2 for z/OS log based recovery
Best practices for DB2 for z/OS log based recoveryFlorence Dubois
 
Lecture 03 data abstraction and er model
Lecture 03 data abstraction and er modelLecture 03 data abstraction and er model
Lecture 03 data abstraction and er modelemailharmeet
 

What's hot (20)

Solving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaSolving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration Dilemma
 
Raid
Raid Raid
Raid
 
Database storage engines
Database storage enginesDatabase storage engines
Database storage engines
 
Citrix XenDesktop and XenApp 7.5 Architecture Deployment
Citrix XenDesktop and XenApp 7.5 Architecture DeploymentCitrix XenDesktop and XenApp 7.5 Architecture Deployment
Citrix XenDesktop and XenApp 7.5 Architecture Deployment
 
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
DB2 for z/OS Bufferpool Tuning win by Divide and Conquer or Lose by Multiply ...
 
IBM DB2 for z/OS Administration Basics
IBM DB2 for z/OS Administration BasicsIBM DB2 for z/OS Administration Basics
IBM DB2 for z/OS Administration Basics
 
RAID Review
RAID ReviewRAID Review
RAID Review
 
Wk 1 - File organization.pptx
Wk 1 - File organization.pptxWk 1 - File organization.pptx
Wk 1 - File organization.pptx
 
Data Flow Diagram
Data Flow DiagramData Flow Diagram
Data Flow Diagram
 
DB2 Basic Commands - UDB
DB2 Basic Commands - UDBDB2 Basic Commands - UDB
DB2 Basic Commands - UDB
 
Domain Name System
Domain Name SystemDomain Name System
Domain Name System
 
20 DFSORT Tricks For Zos Users - Interview Questions
20 DFSORT Tricks For Zos Users - Interview Questions20 DFSORT Tricks For Zos Users - Interview Questions
20 DFSORT Tricks For Zos Users - Interview Questions
 
Database Security
Database SecurityDatabase Security
Database Security
 
DB2 TABLESPACES
DB2 TABLESPACESDB2 TABLESPACES
DB2 TABLESPACES
 
Database backup & recovery
Database backup & recoveryDatabase backup & recovery
Database backup & recovery
 
NetApp & Storage fundamentals
NetApp & Storage fundamentalsNetApp & Storage fundamentals
NetApp & Storage fundamentals
 
DBMS & RDBMS (PPT)
DBMS & RDBMS (PPT)DBMS & RDBMS (PPT)
DBMS & RDBMS (PPT)
 
Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...
Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...
Data Center Tiers : Tier 1, Tier 2, Tier 3 and Tier 4 data center tiers expla...
 
Best practices for DB2 for z/OS log based recovery
Best practices for DB2 for z/OS log based recoveryBest practices for DB2 for z/OS log based recovery
Best practices for DB2 for z/OS log based recovery
 
Lecture 03 data abstraction and er model
Lecture 03 data abstraction and er modelLecture 03 data abstraction and er model
Lecture 03 data abstraction and er model
 

Similar to INF3703 - Chapter 14 Distributed Databases

INF3703 - Chapter 12 Managing Transactions Concurrency
INF3703 - Chapter 12 Managing Transactions ConcurrencyINF3703 - Chapter 12 Managing Transactions Concurrency
INF3703 - Chapter 12 Managing Transactions Concurrencybloeyyy
 
INF3703 - Chapter 13 Managing Database SQL Performance
INF3703 - Chapter 13 Managing Database SQL PerformanceINF3703 - Chapter 13 Managing Database SQL Performance
INF3703 - Chapter 13 Managing Database SQL Performancebloeyyy
 
INF3703 - Chapter 10 Database Development Process
INF3703 - Chapter 10 Database Development ProcessINF3703 - Chapter 10 Database Development Process
INF3703 - Chapter 10 Database Development Processbloeyyy
 
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...Denodo
 
My seminar on distributed dbms
My seminar on distributed dbmsMy seminar on distributed dbms
My seminar on distributed dbmsVinay D. Patel
 
Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User InformationDenodo
 
INF3703 - Chapter 17 Database Connectivity Web Technologies
INF3703 - Chapter 17 Database Connectivity Web TechnologiesINF3703 - Chapter 17 Database Connectivity Web Technologies
INF3703 - Chapter 17 Database Connectivity Web Technologiesbloeyyy
 
1- Introduction for software engineering
1- Introduction for software engineering1- Introduction for software engineering
1- Introduction for software engineeringmouath1424
 
Unit - I DBMS.pptx
Unit - I DBMS.pptxUnit - I DBMS.pptx
Unit - I DBMS.pptxVarshini62
 
Cloud Migration headache? Ease the pain with Data Virtualization! (EMEA)
Cloud Migration headache? Ease the pain with Data Virtualization! (EMEA)Cloud Migration headache? Ease the pain with Data Virtualization! (EMEA)
Cloud Migration headache? Ease the pain with Data Virtualization! (EMEA)Denodo
 
The Growth Of Data Centers
The Growth Of Data CentersThe Growth Of Data Centers
The Growth Of Data CentersGina Buck
 
Distributed database management system
Distributed database management systemDistributed database management system
Distributed database management systemVinay D. Patel
 
Case Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile DatabasesCase Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile DatabasesG. Habib Uddin Khan
 
Case Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile DatabasesCase Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile DatabasesG. Habib Uddin Khan
 
Distributed Database management system .pptx
Distributed Database management system .pptxDistributed Database management system .pptx
Distributed Database management system .pptxbirhanugirmay559
 
Data Virtualization to Survive a Multi and Hybrid Cloud World
Data Virtualization to Survive a Multi and Hybrid Cloud WorldData Virtualization to Survive a Multi and Hybrid Cloud World
Data Virtualization to Survive a Multi and Hybrid Cloud WorldDenodo
 

Similar to INF3703 - Chapter 14 Distributed Databases (20)

INF3703 - Chapter 12 Managing Transactions Concurrency
INF3703 - Chapter 12 Managing Transactions ConcurrencyINF3703 - Chapter 12 Managing Transactions Concurrency
INF3703 - Chapter 12 Managing Transactions Concurrency
 
Distributed dbms (ddbms)
Distributed dbms (ddbms)Distributed dbms (ddbms)
INF3703 - Chapter 14 Distributed Databases

  • 1. Database Principles 3rd Ed., Coronel, Morris, Crockett, Blewett ©2020 Cengage EMEA Database Principles Fundamentals of Design, Implementation, and Management Carlos Coronel, Steven Morris, Keeley Crockett, Craig Blewett CHAPTER 14 DISTRIBUTED DATABASES
  • 2. In this chapter, you will learn: • What a distributed database management system (DDBMS) is and what its components are • How database implementation is affected by different levels of data and process distribution • How transactions are managed in a distributed database environment • How database design is affected by the distributed database environment
  • 3. The Evolution of Distributed Database Management Systems • Distributed database management system (DDBMS) – Governs storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites
  • 4. The Evolution of Distributed Database Management Systems (continued) • Centralised database required that corporate data be stored in a single central site • Dynamic business environment and centralised database’s shortcomings spawned a demand for applications based on data access from different sources at multiple locations
  • 5. The Evolution of Distributed Database Management Systems (continued)
  • 6. DDBMS Advantages and Disadvantages • Advantages include: – Data are located near “greatest demand” site – Faster data access – Faster data processing – Growth facilitation – Improved communications
  • 7. DDBMS Advantages and Disadvantages (continued) • Advantages include (continued): – Reduced operating costs – User-friendly interface – Less danger of a single-point failure – Processor independence
  • 8. DDBMS Advantages and Disadvantages (continued) • Disadvantages include: – Complexity of management and control – Security – Lack of standards – Increased storage requirements – Increased training cost – Higher costs
  • 9. Distributed Processing and Distributed Databases
  • 10. DDBMS Advantages and Disadvantages (continued)
  • 11. Characteristics of Distributed Management Systems • Application interface • Validation • Transformation • Query optimisation • Mapping • I/O interface
  • 12. Characteristics of Distributed Management Systems (continued) • Formatting • Security • Backup and recovery • DB administration • Concurrency control • Transaction management
  • 13. Characteristics of Distributed Management Systems (continued) • Must perform all the functions of centralised DBMS • Must handle all necessary functions imposed by distribution of data and processing – Must perform these additional functions transparently to the end user
  • 14. Characteristics of Distributed Management Systems (continued)
  • 15. DDBMS Components • Must include (at least) the following components: – Computer workstations – Network hardware and software – Communications media – Transaction processor (TP), also called application processor or transaction manager • Software component found in each computer that requests data
  • 16. DDBMS Components (continued) • Must include (at least) the following components (continued): – Data processor (DP) • Software component residing on each computer that stores and retrieves data located at the site • May be a centralised DBMS
  • 18. Levels of Data and Process Distribution
  • 19. Single-Site Processing, Single-Site Data (SPSD) • In the single-site processing, single-site data (SPSD) scenario, all processing is done on a single host computer and all data are stored on the host computer’s local disk. • The DBMS is located on the host computer, which is accessed by dumb terminals. • The TP and the DP are embedded within the DBMS.
  • 20. Single-Site Processing, Single-Site Data (SPSD) (continued)
  • 21. Multiple-Site Processing, Single-Site Data (MPSD) • Multiple processes run on different computers sharing a single data repository • MPSD scenario requires a network file server running conventional applications that are accessed through a network • Many multiuser accounting applications, running under a personal computer network, fit such a description
  • 22. Multiple-Site Processing, Single-Site Data (MPSD) (continued)
  • 23. Multiple-Site Processing, Multiple-Site Data (MPMD) • Fully distributed database management system with support for multiple data processors and transaction processors at multiple sites • Classified as either homogeneous or heterogeneous • Homogeneous DDBMSs – Integrate only one type of centralised DBMS over a network
  • 24. Multiple-Site Processing, Multiple-Site Data (MPMD) (continued) • Heterogeneous DDBMSs – Integrate different types of centralised DBMSs over a network • Fully heterogeneous DDBMS – Support different DBMSs that may even support different data models (relational, hierarchical, or network) running under different computer systems, such as mainframes and microcomputers
  • 26. Distributed Database Transparency Features • Allow end user to feel like database’s only user • Features include: – Distribution transparency – Transaction transparency – Failure transparency – Performance transparency – Heterogeneity transparency
  • 27. Distribution Transparency • Allows management of physically dispersed database as though it were a centralised database • Following three levels of distribution transparency are recognised: – Fragmentation transparency – Location transparency – Local mapping transparency
  • 28. Distribution Transparency (continued)
  • 29. Distribution Transparency (continued)
  • 30. Transaction Transparency • Ensures database transactions will maintain distributed database’s integrity and consistency
  • 31. Distributed Requests and Distributed Transactions • Distributed transaction – Can update or request data from several different remote sites on network • Remote request – Lets single SQL statement access data to be processed by single remote database processor • Remote transaction – Accesses data at single remote site
  • 32. Distributed Requests and Distributed Transactions (continued) • Distributed transaction – Allows transaction to reference several different (local or remote) DP sites • Distributed request – Lets single SQL statement reference data located at several different local or remote DP sites
  • 33. Distributed Requests and Distributed Transactions (continued)
  • 34. Distributed Requests and Distributed Transactions (continued)
  • 35. Distributed Requests and Distributed Transactions (continued)
  • 36. Distributed Requests and Distributed Transactions (continued)
  • 37. Distributed Requests and Distributed Transactions (continued)
  • 38. Distributed Concurrency Control • Multisite, multiple-process operations are much more likely to create data inconsistencies and deadlocked transactions than are single-site systems
  • 40. Two-Phase Commit Protocol • Distributed databases make it possible for transaction to access data at several sites • Final COMMIT must not be issued until all sites have committed their parts of transaction • Two-phase commit protocol requires each individual DP’s transaction log entry be written before database fragment is actually updated
  • 41. Two-Phase Commit Protocol (continued) • The DO-UNDO-REDO protocol is used by the DP to roll back and/or roll forward transactions with the help of the system’s transaction log entries. • The DO-UNDO-REDO protocol defines three types of operations: – DO performs the operation and records the ‘before’ and ‘after’ values in the transaction log. – UNDO reverses an operation, using the log entries written by the DO portion of the sequence. – REDO redoes an operation, using the log entries written by the DO portion of the sequence.
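
The log-before-update rule and the DO-UNDO-REDO operations on the two slides above can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the textbook's or any DBMS's implementation: the `DataProcessor` class, its `prepare`/`commit` methods, the `can_commit` flag and the `two_phase_commit` coordinator function are all hypothetical names, and failures, timeouts and durable logging are left out.

```python
from dataclasses import dataclass, field

_MISSING = object()  # marks a key that did not exist before the DO step


@dataclass(eq=False)
class DataProcessor:
    """Hypothetical DP holding one database fragment and its transaction log."""
    name: str
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    can_commit: bool = True  # set to False to simulate a site that must abort

    def do(self, key, new_value):
        # DO: record the 'before' and 'after' values in the log first,
        # then update the fragment (the log entry is written before the data).
        self.log.append((key, self.data.get(key, _MISSING), new_value))
        self.data[key] = new_value

    def undo(self):
        # UNDO: walk the log backwards and restore each 'before' value.
        for key, before, _after in reversed(self.log):
            if before is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = before
        self.log.clear()

    def redo(self):
        # REDO: reapply each logged 'after' value in the original order.
        for key, _before, after in self.log:
            self.data[key] = after

    def prepare(self):
        # Phase 1 vote: True means "prepared to commit".
        return self.can_commit

    def commit(self):
        # Phase 2: the changes are final, so the log entries are released.
        self.log.clear()


def two_phase_commit(updates):
    """updates: list of (DataProcessor, key, new_value). Returns True on commit."""
    sites = []
    for dp, key, value in updates:
        dp.do(key, value)
        if dp not in sites:
            sites.append(dp)
    # Phase 1: the coordinator asks every participating site for its vote.
    if all(dp.prepare() for dp in sites):
        # Phase 2: every site voted yes, so all of them commit.
        for dp in sites:
            dp.commit()
        return True
    # Any "no" vote aborts the transaction at every site.
    for dp in sites:
        dp.undo()
    return False
```

With two sites where one refuses to commit, `two_phase_commit` returns `False` and both fragments keep their original values; once every site votes yes, the same call commits the changes everywhere.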
  • 42. Performance and Failure Transparency • Objective of query optimisation routine is to minimise total cost associated with execution of request • Costs associated with request are function of: – Access time (I/O) cost – Communication cost – CPU time cost • Must provide distribution transparency as well as replica transparency
  • 43. Performance and Failure Transparency (continued) • Resolving data requests in a distributed data environment must take the following points into consideration: – Data distribution – Data replication – Network and node availability
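
As a rough illustration of the cost factors listed above, the sketch below picks the cheapest available access plan for a request. The slides only say that cost is a function of I/O, communication and CPU time; treating it as a plain sum is an assumption made here, and `AccessPlan`, `choose_plan` and the cost figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessPlan:
    """Hypothetical plan for answering a request from one site or replica."""
    site: str
    io_cost: float
    communication_cost: float
    cpu_cost: float
    available: bool = True  # network and node availability

    def total_cost(self) -> float:
        # Assumption: total cost is the simple sum of the three components.
        return self.io_cost + self.communication_cost + self.cpu_cost


def choose_plan(plans: list[AccessPlan]) -> Optional[AccessPlan]:
    """Return the cheapest plan whose site is reachable, or None if none is."""
    candidates = [plan for plan in plans if plan.available]
    return min(candidates, key=AccessPlan.total_cost, default=None)
```

A replica that is cheap but unreachable is skipped, which is where data replication and node availability enter the decision.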
  • 44. Distributed Database Design • Data fragmentation – How to partition database into fragments • Data replication – Which fragments to replicate • Data allocation – Where to locate those fragments and replicas
  • 45. Data Fragmentation • Breaks single object into two or more segments or fragments • Each fragment can be stored at any site over computer network • Information about data fragmentation is stored in distributed data catalog (DDC), from which it is accessed by TP to process user requests
  • 46. Data Fragmentation (continued) • Strategies – Horizontal fragmentation • Division of a relation into subsets (fragments) of tuples (rows) – Vertical fragmentation • Division of a relation into attribute (column) subsets – Mixed fragmentation • Combination of horizontal and vertical strategies
  • 47. Data Fragmentation (continued)
  • 48. Data Fragmentation (continued)
  • 49. Data Fragmentation (continued)
  • 50. Data Fragmentation (continued)
  • 51. Data Fragmentation (continued)
  • 52. Data Fragmentation (continued)
  • 53. Data Fragmentation (continued)
  • 54. Data Fragmentation (continued)
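
The fragmentation strategies above can be illustrated with a small relation held as a list of dictionaries. The `CUSTOMER` rows, attribute names and helper functions below are made up for this sketch; they are not taken from the figures on the slides.

```python
# Hypothetical CUSTOMER relation; every row carries the key attribute cus_num.
CUSTOMER = [
    {"cus_num": 10, "cus_name": "Sinex", "cus_state": "TN", "cus_balance": 1090.0},
    {"cus_num": 11, "cus_name": "Martin", "cus_state": "FL", "cus_balance": 340.0},
    {"cus_num": 12, "cus_name": "Mydex", "cus_state": "TN", "cus_balance": 0.0},
]


def horizontal_fragments(rows, attribute):
    """Horizontal fragmentation: split the rows into subsets by one attribute."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[attribute], []).append(row)
    return fragments


def vertical_fragment(rows, key, attributes):
    """Vertical fragmentation: keep the key plus a subset of the columns."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


def rejoin_vertical(left, right, key):
    """Rebuild the full rows by joining two vertical fragments on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


by_state = horizontal_fragments(CUSTOMER, "cus_state")
names = vertical_fragment(CUSTOMER, "cus_num", ["cus_name"])
finance = vertical_fragment(CUSTOMER, "cus_num", ["cus_state", "cus_balance"])
```

Each vertical fragment repeats the key so that the original rows can be rebuilt by a join. Mixed fragmentation would apply `vertical_fragment` to each of the horizontal fragments in `by_state`.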
  • 55. Data Replication • Storage of data copies at multiple sites served by computer network • Fragment copies can be stored at several sites to serve specific information requirements – Can enhance data availability and response time – Can help to reduce communication and total query costs
  • 56. Data Replication (continued)
  • 57. Data Replication (continued) • Replication scenarios – Fully replicated database • Stores multiple copies of each database fragment at multiple sites • Can be impractical due to amount of overhead – Partially replicated database • Stores multiple copies of some database fragments at multiple sites • Most DDBMSs are able to handle the partially replicated database well
  • 58. Data Replication (continued) • Replication scenarios (continued) – Unreplicated database • Stores each database fragment at single site • No duplicate database fragments
  • 59. Data Allocation • Deciding where to locate data • Allocation strategies – Centralised data allocation • Entire database is stored at one site – Partitioned data allocation • Database is divided into several disjoint parts (fragments) and stored at several sites
  • 60. Data Allocation (continued) • Allocation strategies (continued) – Replicated data allocation • Copies of one or more database fragments are stored at several sites • Data distribution over computer network is achieved through data partition, data replication, or combination of both
  • 61. The CAP Theorem • The initials CAP stand for the three desirable properties: – Consistency. In a distributed database, consistency takes a bigger role. All nodes should see the same data at the same time, which means that the replicas should be immediately updated. – Availability. Simply speaking, a request is always fulfilled by the system. No received request is ever lost. If you are buying tickets online, you do not want the system to stop in the middle of the operation. – Partition tolerance. The system continues to operate even in the event of a node failure.
  • 62. The CAP Theorem (continued) • This trade-off between consistency and availability has generated a new type of distributed data system in which data are basically available, soft state, eventually consistent (BASE). • BASE refers to a data consistency model in which data changes are not immediate but propagate slowly through the system until all replicas are eventually consistent
  • 63. The CAP Theorem (continued)
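
The BASE behaviour described above, where a change reaches the other replicas only after a delay, can be sketched as a toy in-memory store. `EventuallyConsistentStore` and its methods are hypothetical names used only for illustration; real systems also have to deal with conflicting writes, ordering and node failures, none of which is modelled here.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy replicated key-value store with delayed update propagation."""

    def __init__(self, replica_names):
        self.replicas = {name: {} for name in replica_names}
        self.pending = deque()  # queued (target replica, key, value) updates

    def write(self, replica, key, value):
        # The write is accepted at once by one replica (basically available)
        # and queued for the others instead of being applied immediately.
        self.replicas[replica][key] = value
        for other in self.replicas:
            if other != replica:
                self.pending.append((other, key, value))

    def read(self, replica, key):
        # A read may return stale data until propagation has caught up.
        return self.replicas[replica].get(key)

    def propagate(self, steps=None):
        """Apply up to `steps` queued updates (all of them if steps is None)."""
        applied = 0
        while self.pending and (steps is None or applied < steps):
            target, key, value = self.pending.popleft()
            self.replicas[target][key] = value
            applied += 1
        return applied
```

After a write to one replica, a read from another replica returns the old value until `propagate` has delivered the queued update; once the queue is empty, all replicas agree.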
  • 64. Distributed Databases Within the Cloud • Cloud computing is a new style of delivering applications, data and resources to users over the web. It provides an alternative for organisations that do not wish to provide their own information technology (IT) infrastructure to host their own databases or software. • A third-party cloud provider uses a number of interconnected and virtualised computers to supply a range of IT services which are standardised. • Each third-party cloud provider will have its own flexible pricing model for each service it provides. • This is often called a service level agreement.
  • 65. Distributed Databases Within the Cloud (continued) • The main benefits to an organisation of using a cloud infrastructure are: – Cost effectiveness. As the third-party cloud provider is likely to be hosting services for many organisations, only one IT infrastructure is required, which reduces the cost to each individual organisation. – Latest software. In order to remain competitive, most third-party cloud providers will ensure that their software is always the latest version available. – Scalable architecture. If the data requirements of the organisation expand, it is easy to increase the database capacity and/or change the underlying data model. – Mobile access. Data and software within the cloud can generally be accessed from anywhere.
  • 66. Distributed Databases Within the Cloud (continued) • Current NoSQL solutions include column stores and document stores. • Column stores are for large-scale distributed systems which store petabytes of data across hundreds, if not thousands, of servers. – For example, Google uses Bigtable to store its structured data for applications such as Google Earth. • Document stores move away from storing data in tables. Instead, each document is stored differently depending upon its size and format. Document stores are referred to as document-orientated databases.
  • 67. C. J. Date’s Twelve Commandments for Distributed Databases • Local site independence • Central site independence • Failure independence • Location transparency • Fragmentation transparency • Replication transparency
  • 68. C. J. Date’s Twelve Commandments for Distributed Databases (continued) • Distributed query processing • Distributed transaction processing • Hardware independence • Operating system independence • Network independence • Database independence
  • 69. Summary • Distributed database stores logically related data in two or more physically independent sites connected via computer network • Distributed processing is division of logical database processing among two or more network nodes • Distributed databases require distributed processing • Main components of DDBMS are transaction processor and data processor
  • 70. Summary (continued) • Current database systems can be classified by extent to which they support processing and data distribution • Homogeneous distributed database system integrates only one particular type of DBMS over computer network • Heterogeneous distributed database system integrates several different types of DBMSs over computer network
  • 71. Summary (continued) • DDBMS characteristics are best described as set of transparencies • Transaction is formed by one or more database requests • Distributed concurrency control is required in network of distributed databases • Distributed DBMS evaluates every data request to find optimum access path in distributed database
  • 72. Summary (continued) • The design of distributed database must consider fragmentation and replication of data • Database can be replicated over several different sites on computer network • The CAP theorem states that a highly distributed data system has some desirable properties of consistency, availability, and partition tolerance. However, a system can only provide two of these properties at a time.