DISTRIBUTED
DBMS
BY PATEL VINAYKUMAR DINESHCHANDRA
CONTENT
• Abstract
• Introduction
• Definition
• Architecture
• How it works
• Its types
• Characteristics
• Functions
• Advantages
• Disadvantages
ABSTRACT
• This presentation introduces the distributed database management system (DDBMS).
• It covers the architecture, design, types, properties, and functionality of a
distributed DBMS.
INTRODUCTION
• In today’s world of universal dependence on information systems, all sorts of people need access to
companies’ databases. In addition to a company’s own employees, these include the company’s
customers, potential customers, suppliers, and vendors of all types. It is possible for a company to have
all of its databases concentrated at one mainframe computer site with worldwide access to this site
provided by telecommunications networks, including the Internet.
• Although such a centralized system and its databases can be managed in a well-contained
manner, which is an advantage, centralization poses some problems as well. For example, if the
single site goes down, everyone is blocked from accessing the databases until the site comes
back up. Also, the communication costs from the many distant PCs and terminals to the central
site can be high.
• One solution to such problems, and an alternative design to the centralized database concept, is known
as a ‘Distributed Database’. The idea is that instead of having one centralized database, the data is
spread among the sites (for example, cities) on the distributed network, each of which has its own
computer and data storage facilities. All of this distributed data is still considered to be a single
logical database.
• When a person or process anywhere on the distributed network queries the database, it does not
need to know where on the network the requested data is located. The user just issues the
query, and the result is returned. This feature is known as ‘Location Transparency’. Providing it can
become complex very quickly, so it must be managed by sophisticated software known as a ‘Distributed
Database Management System’.
DEFINITION
• A distributed database (DDB) is a collection of multiple, logically interrelated databases
distributed over a computer network.
• A ‘Distributed Database Management System’ (Distributed DBMS) is the software that
manages the DDB, and provides an access mechanism that makes this distribution
transparent to the user.
• Distributed Database System (DDBS) is the integration of Distributed DB and Distributed
DBMS.
• This integration is achieved by merging database and networking technologies. It can
also be described as a system that runs on a collection of machines that do not have
shared memory, yet looks to the user like a single machine.
ARCHITECTURE OF DISTRIBUTED DBMS
• Each computer (site) in a distributed system may contain a Transaction Manager (TM)
and a Data Manager (DM) - as we will see later, there is also a Transaction Coordinator
(TC). The TM is responsible for the Transactions received by the computer. The DM
manages the database access on the local computer.
• When a Transaction arrives at the TM, the TM divides it into sub-transactions,
which are transmitted to the DMs that hold the data needed by the
Transaction. (In some cases the TC is responsible for this.)
• The TM collects the data returned in the sub-transactions' responses and
produces the final result.
• Any TM can communicate with all DMs.
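The TM/DM interaction described above can be illustrated with a small sketch. It is only an illustration under simplifying assumptions: the class names `DataManager` and `TransactionManager`, the in-memory `rows` dictionary, and the restriction to read-only transactions are all hypothetical and not taken from any particular DDBMS. The Transaction Coordinator is left out.

```python
from dataclasses import dataclass, field


@dataclass
class DataManager:
    """Manages access to the data stored at one site (hypothetical in-memory store)."""
    site: str
    rows: dict = field(default_factory=dict)

    def read(self, keys):
        # Return only the requested keys that this site actually holds.
        return {k: self.rows[k] for k in keys if k in self.rows}


class TransactionManager:
    """Splits a read-only transaction into sub-transactions, one per relevant DM."""

    def __init__(self, data_managers):
        # Any TM can communicate with all DMs, so it keeps a reference to each.
        self.data_managers = list(data_managers)

    def split(self, keys):
        # Build one sub-transaction (a list of keys) for every site holding needed data.
        plan = {}
        for dm in self.data_managers:
            wanted = [k for k in keys if k in dm.rows]
            if wanted:
                plan[dm.site] = wanted
        return plan

    def execute(self, keys):
        # Send each sub-transaction to its DM and merge the responses into one result.
        plan = self.split(keys)
        result = {}
        for dm in self.data_managers:
            if dm.site in plan:
                result.update(dm.read(plan[dm.site]))
        return result
```

A caller would create one `DataManager` per site, hand them all to a `TransactionManager`, and call `execute` with the keys it needs. The caller never names a site, which is the location transparency described in the introduction.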
HOW DISTRIBUTED DBMS WORKS
TYPES OF DISTRIBUTED DBMS
1. Homogeneous Distributed DBMS
2. Heterogeneous Distributed DBMS
1. HOMOGENEOUS DISTRIBUTED DBMS
Here, all sites use the same DBMS product, with the same schemata and the same data
dictionaries. In this case the application programs are independent of how the database is
distributed; i.e. the distribution of the physical data can be altered without having to
alter the application programs.
2. HETEROGENEOUS DISTRIBUTED DBMS
Here, the sites run different kinds of DBMSs (e.g. hierarchical, network, relational,
object-oriented), with different underlying data models. In this case the application
programs are dependent on the physical location of the stored data; i.e. application
programs must be altered if data is moved from one site to another.
CHARACTERISTICS OF DISTRIBUTED DBMS
- A Distributed DBMS developed by a single vendor may contain:
• Data Independence
• Concurrency Control
• Replication facilities
• Recovery facilities
FUNCTIONS OF DISTRIBUTED DBMS
• A DDBMS governs the storage and processing of logically related data over
interconnected computer systems in which both data and processing functions are
distributed among several sites. A DBMS must have at least the following functions to be
classified as distributed:
1. Application interface to interact with the end user, application programs, and other
DBMSs within the distributed database.
2. Validation to analyse data requests for syntax correctness.
3. Transformation to decompose complex requests into atomic data request
components.
4. Query optimization to find the best access strategy. (Which database fragments
must be accessed by the query, and how must data updates, if any, be
synchronized?)
5. Mapping to determine the data location of local and remote fragments.
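As a rough illustration of how functions 2 to 5 might fit together, the sketch below chains them into one planning step. Everything in it is an assumption made for the example: the fragment and site names, the dictionary-based request format, and the "local fragments first" rule, which merely stands in for real query optimization. The sketch runs mapping before optimization so that the optimizer can see which fragments are remote; that ordering is a choice of this example.

```python
# Hypothetical catalog: which site stores each fragment (names are assumed).
FRAGMENT_SITES = {"customers_eu": "site_paris", "customers_us": "site_dallas"}
LOCAL_SITE = "site_paris"


def validate(request):
    # Validation: check the basic shape of the request before doing any work.
    if not isinstance(request, dict) or "op" not in request or "fragments" not in request:
        raise ValueError("request needs 'op' and 'fragments'")
    if request["op"] not in {"read", "update"}:
        raise ValueError("unsupported operation")
    return request


def transform(request):
    # Transformation: decompose the request into one atomic request per fragment.
    return [{"op": request["op"], "fragment": f} for f in request["fragments"]]


def map_locations(atomic_requests):
    # Mapping: attach the site of each fragment and mark it as local or remote.
    mapped = []
    for req in atomic_requests:
        site = FRAGMENT_SITES[req["fragment"]]
        mapped.append({**req, "site": site, "remote": site != LOCAL_SITE})
    return mapped


def optimize(mapped):
    # Stand-in for query optimization: visit local fragments before remote ones.
    return sorted(mapped, key=lambda r: r["remote"])


def plan(request):
    # Chain the functions: validate, transform, map, then order the work.
    return optimize(map_locations(transform(validate(request))))
```

Calling `plan({"op": "read", "fragments": ["customers_us", "customers_eu"]})` yields two atomic requests, with the one for the local fragment placed first. A production optimizer would instead rely on cost estimates such as fragment sizes and network costs.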
ADVANTAGES
• Data are located near the greatest demand site :- The data in a distributed database
system are dispersed to match business requirements, which reduces the cost of data
access.
• Faster data access :- End users often work with only a locally stored subset of the
company’s data.
• Faster data processing :- A distributed database system spreads out the system's
workload by processing data at several sites.
• Growth facilitation :- New sites can be added to the network without affecting the
operations of other sites.
• Improved communications :- Because local sites are smaller and located closer to
customers, local sites foster better communication among departments and between
customers and company staff.
DISADVANTAGES
• Complexity of management and control :- Applications must recognize data location,
and they must be able to stitch together data from various sites. Database administrators
must have the ability to coordinate database activities to prevent database degradation
due to data anomalies.
• Technological difficulty :- Data integrity, transaction management, concurrency
control, security, backup, recovery, query optimization, access path selection, and so on,
must all be addressed and resolved.
• Security :- The probability of security lapses increases when data are located at
multiple sites. The responsibility of data management will be shared by different people
at several sites.
• Lack of standards :- There are no standard communication protocols at the database
level. For example, different database vendors employ different—and often
incompatible—techniques to manage the distribution of data and processing in a
DDBMS environment.