Distributed Database System
RAVINDER CHAMOLI
MSC[CS]
OMIT, RISHIKESH
Distributed Database Management
In a distributed database system, the database is stored on several computers. A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network. The computers in a distributed system communicate with one another through various communication media, such as high-speed networks or telephone lines.
They do not share main memory or disks. The computers in a distributed system may vary in size and function, ranging from workstations up to mainframe systems. The computers in a distributed system are referred to by a number of different names, such as sites or nodes. The term "site" is used mainly to emphasize the physical distribution of these systems.
Reasons for Building Distributed Database Systems
Sharing data:
o The major advantage of building a distributed database system is the provision of an environment where users at one site may be able to access data residing at other sites.
Autonomy:
o The primary advantage of sharing data by means of distribution is that each site is able to retain a degree of control over the data that are stored locally.
Availability:
o If one site fails in a distributed system, the remaining sites may be able to continue operating. In particular, if data items are replicated at several sites, a transaction needing a particular data item may find that item at any of those sites.
o Thus, the failure of a site does not necessarily imply the shutdown of the system.
Properties of Distributed Database Systems
A distributed database system should make the impact of data distribution transparent. Distributed database systems have two major properties:
o Distributed data independence
o Distributed transaction atomicity
Distributed Data Independence
The distributed data independence property enables users to ask queries without specifying where the referenced relations, or copies or fragments of those relations, are located. This principle is a natural extension of physical and logical data independence.
Distributed Transaction Atomicity
The distributed transaction atomicity property enables users to write transactions that access and update data at several sites just as they would write transactions over purely local data. In particular, the effects of a transaction across sites should continue to be atomic.
Types of Distributed Database
1. Homogeneous distributed database
2. Heterogeneous distributed database
Homogeneous Distributed Database
A homogeneous distributed database is the simplest form of a distributed database: there are several sites, each running its own applications on the same DBMS software. All sites have identical DBMS software, are aware of one another, and agree to cooperate in processing user requests.
Heterogeneous Distributed Database
In a heterogeneous distributed database system, different sites run under the control of different DBMS software. Such a system is also referred to as a multidatabase system or a federated database system (FDBS). It relies on well-accepted standards for gateway protocols, which expose DBMS functionality to external applications.
Gateway protocols help the different sites communicate with one another.
Distributed Data Storage
Consider a relation ‘r’ that is to be stored in the database. There are two approaches to storing this relation in the distributed database:
o Replication: The system maintains several identical replicas (copies) of the relation and stores each replica at a different site. The alternative to replication is to store only one copy of the relation ‘r’.
o Fragmentation: The system partitions the relation into several fragments and stores each fragment at a different site.
o Fragmentation and replication can be combined: A relation can be partitioned into several fragments, and there may be several replicas of each fragment.
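The combination of fragmentation and replication can be sketched in a few lines of Python. This is a minimal illustration, not a real storage layer: the sample tuples, the "branch" attribute, the site names, and the helper functions `fragment_by` and `replicate` are all hypothetical, and the sketch assumes horizontal fragmentation (splitting by rows), which is only one way to partition a relation.

```python
from collections import defaultdict

# Hypothetical relation r; the attribute names and values are made up.
r = [
    {"account": "A-101", "branch": "Rishikesh", "balance": 500},
    {"account": "A-215", "branch": "Dehradun", "balance": 700},
    {"account": "A-305", "branch": "Rishikesh", "balance": 350},
]

def fragment_by(relation, attribute):
    """Horizontal fragmentation: group tuples by the value of one attribute."""
    fragments = defaultdict(list)
    for row in relation:
        fragments[row[attribute]].append(row)
    return dict(fragments)

def replicate(fragments, placement):
    """Store a replica of each fragment at every site listed for it."""
    sites = defaultdict(dict)
    for name, rows in fragments.items():
        for site in placement[name]:
            sites[site][name] = list(rows)  # each site gets its own list
    return dict(sites)

fragments = fragment_by(r, "branch")
# Assumed placement: every fragment is replicated at two sites.
placement = {"Rishikesh": ["site1", "site2"], "Dehradun": ["site2", "site3"]}
sites = replicate(fragments, placement)

# r can be rebuilt as the union of one replica of each fragment.
reconstructed = [
    row for name in fragments for row in sites[placement[name][0]][name]
]
```

Here site2 holds a replica of both fragments, while site1 and site3 hold one fragment each, so the loss of any single site still leaves every fragment available somewhere.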
Transparency
The user of a distributed database system should not be required to know either where the data are physically located or how the data can be accessed at a specific local site. This characteristic is called data transparency. Data transparency can take several forms:
o Fragmentation transparency
o Replication transparency
o Location transparency
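One way to picture the three forms of transparency is a lookup layer that resolves a relation name to fragments, replicas, and sites on the user's behalf. The sketch below is an assumption-laden illustration: the `CATALOG` and `STORAGE` dictionaries, the fragment and site names, and the `read_relation` function are hypothetical and do not correspond to any particular DBMS.

```python
# Hypothetical catalog: relation name -> fragment name -> sites with a replica.
CATALOG = {
    "account": {
        "account_f1": ["site1", "site2"],
        "account_f2": ["site3"],
    }
}

# Hypothetical per-site storage, keyed by (site, fragment).
STORAGE = {
    ("site1", "account_f1"): [("A-101", 500)],
    ("site2", "account_f1"): [("A-101", 500)],
    ("site3", "account_f2"): [("A-215", 700)],
}

def read_relation(name, down_sites=frozenset()):
    """Return all tuples of a relation.

    The caller names only the relation: fragments (fragmentation
    transparency), replica choice (replication transparency), and
    sites (location transparency) are resolved here.
    """
    rows = []
    for fragment, replicas in CATALOG[name].items():
        live = [site for site in replicas if site not in down_sites]
        if not live:
            raise RuntimeError(f"no reachable replica of {fragment}")
        rows.extend(STORAGE[(live[0], fragment)])
    return rows
```

A call such as `read_relation("account")` returns the same tuples whether or not site1 is reachable, because the replicated fragment can be read from site2 instead. A fragment with a single replica, like the one at site3, has no such fallback.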
System Structure
Each site has its own local transaction manager, whose function is to ensure the ACID properties of those transactions that execute at that site. The various transaction managers cooperate to execute global transactions.
To understand how such a manager can be implemented, consider an abstract model of a transaction system in which each site contains two subsystems: a transaction manager and a transaction coordinator. The transaction manager manages the execution of those transactions (or subtransactions) that access data stored at the local site.
Note that each such transaction may be either a local transaction (that is, a transaction that executes at only that site) or part of a global transaction (that is, a transaction that executes at several sites). The transaction coordinator coordinates the execution of the various transactions (both local and global) initiated at that site.
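The division of labour between the two subsystems can be sketched as follows. This is a deliberately simplified model: the class and method names are hypothetical, updates are plain integer deltas, and the sketch omits everything that makes the problem hard in practice (concurrency control, logging, failure handling, and atomic commitment).

```python
class TransactionManager:
    """Runs transactions or subtransactions against one site's local data."""

    def __init__(self, site, data):
        self.site = site
        self.data = data  # local data items: name -> integer value
        self.log = []     # ids of the (sub)transactions executed here

    def execute(self, txn_id, updates):
        for key, delta in updates.items():
            self.data[key] += delta
        self.log.append(txn_id)


class TransactionCoordinator:
    """Coordinates the transactions initiated at its own site."""

    def __init__(self, site, managers):
        self.site = site
        self.managers = managers  # site name -> TransactionManager

    def run(self, txn_id, updates_by_site):
        # Break the transaction into one subtransaction per participating
        # site and hand each one to that site's transaction manager.
        for site, updates in updates_by_site.items():
            self.managers[site].execute(txn_id, updates)
        return "global" if len(updates_by_site) > 1 else "local"


managers = {
    "site1": TransactionManager("site1", {"x": 500}),
    "site2": TransactionManager("site2", {"y": 200}),
}
coordinator = TransactionCoordinator("site1", managers)

# A transfer touching two sites is a global transaction.
kind = coordinator.run("T1", {"site1": {"x": -100}, "site2": {"y": 100}})
```

In this model a transaction is global simply because its updates span more than one site; a transaction that touches only site1 would be handled entirely by that site's transaction manager.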
Distributed Query Processing
In a distributed system, query processing must take several additional matters into account, including:
o The cost of data transmission over the network.
o The potential gain in performance from having several sites process parts of the query in parallel.
The relative cost of data transfer over the network and data transfer to and from disk varies widely, depending on the type of network and on the speed of the disks. Thus, in general, we cannot focus solely on disk costs or on network costs. Rather, we must find a good trade-off between the two.
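A toy cost model makes the trade-off concrete. Everything in the sketch below is an assumption made for illustration: the two candidate plans, their byte counts, the transfer rates, and the simple additive formula (disk time plus network time, with no parallelism or CPU cost).

```python
def plan_cost(disk_bytes, network_bytes, disk_rate, network_rate):
    """Estimated time in seconds: disk transfer time plus network transfer time."""
    return disk_bytes / disk_rate + network_bytes / network_rate

# Hypothetical plans for the same query, with made-up byte counts.
PLANS = {
    # Read the relation once and ship all of it to the querying site.
    "ship_whole_relation": {"disk_bytes": 100e6, "network_bytes": 100e6},
    # Do more local disk work, but ship only a small result.
    "filter_locally_then_ship": {"disk_bytes": 300e6, "network_bytes": 5e6},
}

def cheapest_plan(disk_rate, network_rate):
    """Pick the plan with the lowest estimated cost for the given rates."""
    return min(
        PLANS,
        key=lambda name: plan_cost(
            **PLANS[name], disk_rate=disk_rate, network_rate=network_rate
        ),
    )

slow_network_choice = cheapest_plan(disk_rate=200e6, network_rate=1e6)
fast_network_choice = cheapest_plan(disk_rate=200e6, network_rate=1e9)
```

With the slow network assumed here, shipping the whole relation is dominated by network time, so the plan that does extra disk work wins. With the fast network the ranking flips, which is why neither cost can be considered in isolation.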
System Failure Modes
A distributed system may suffer from the same types of failure that a centralized system does (for example, software errors, hardware errors, or disk crashes). In addition, it is subject to failure types that arise from distribution. The basic failure types are:
o Failure of a site.
o Loss of messages.
o Failure of a communication link.
o Network partition.
To provide high availability, a distributed database must detect failures, reconfigure itself so that computation may continue, and recover when a processor or a link is repaired. The task is greatly complicated by the fact that it is hard to distinguish between network partitions and site failures.
Other Important Issues
Commit Protocols
o If we are to ensure atomicity, all the sites at which a transaction T executed must agree on the final outcome of the execution: T must either commit at all sites, or it must abort at all sites.
o To ensure this property, the transaction coordinator of T must execute a commit protocol.
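The text does not name a specific commit protocol. One widely used example is two-phase commit, sketched below under strong simplifying assumptions: the `Participant` class and its `can_commit` flag are hypothetical, all calls are synchronous and in-process, and there is no logging, timeout handling, or recovery from coordinator or site failure.

```python
class Participant:
    """A site taking part in transaction T (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumed outcome of local processing
        self.state = "active"

    def prepare(self):
        # Vote yes only if this site is able to commit its part of T.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Run by the coordinator of T; returns the global decision."""
    # Phase 1: collect a vote from every participating site.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"


sites_ok = [Participant("site1"), Participant("site2")]
sites_bad = [Participant("site1"), Participant("site2", can_commit=False)]
decision_ok = two_phase_commit(sites_ok)
decision_bad = two_phase_commit(sites_bad)
```

In the second run a single "no" vote from site2 forces site1 to abort as well, so the sites never disagree on the outcome of T.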
Timestamping
o The principal idea behind timestamping is that each transaction is given a unique timestamp that the system uses in deciding the serialization order.
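The text does not say how unique timestamps are produced across sites. A common scheme, assumed here, pairs a local counter with a site identifier so that two sites can never issue the same timestamp. The `TimestampGenerator` class and the transaction names are hypothetical.

```python
import itertools

class TimestampGenerator:
    """Issues timestamps of the form (local counter, site id)."""

    def __init__(self, site_id):
        self.site_id = site_id
        self._counter = itertools.count(1)

    def next_timestamp(self):
        # The site id breaks ties between equal counters at different sites.
        return (next(self._counter), self.site_id)


site1 = TimestampGenerator(site_id=1)
site2 = TimestampGenerator(site_id=2)

# Each transaction receives its timestamp when it starts.
transactions = {
    "T1": site1.next_timestamp(),
    "T2": site2.next_timestamp(),
    "T3": site1.next_timestamp(),
}

# The serialization order follows the timestamps (tuples compare field by field).
serialization_order = sorted(transactions, key=transactions.get)
```

T1 and T2 start with the same counter value at different sites, and the site identifier decides their relative order. This sketch covers only timestamp assignment and ordering, not the rules a scheduler would apply to reject or delay conflicting operations.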