Advanced Database Management Systems: 31
Data Distribution Transparency
Prof Neeraj Bhargava
Vaibhav Khanna
Department of Computer Science
School of Engineering and Systems Sciences
Maharshi Dayanand Saraswati University Ajmer
Data Placement
• One of the most important decisions a distributed
database designer has to make is data placement.
Proper data placement is a crucial factor in determining
the success of a distributed database system.
• There are four basic alternatives:
– centralized,
– replicated,
– partitioned, and
– hybrid.
• Some of these require additional analysis to fine-tune the
placement of data.
Transparency
• Ideally, the distribution in a distributed
database system should be transparent to
the user, whose interaction with the database
should be similar to that provided by a single,
centralized database.
• Desirable forms of transparency include data distribution
transparency, which comprises:
– fragmentation transparency,
– location transparency, and
– replication transparency.
Data distribution transparency
• This type of transparency can take several
forms.
• The user should not have to know how the
data is fragmented, a property called
fragmentation transparency.
• Users should be unaware of the actual
location of the data items they access, a
property called location transparency.
• If data is replicated, users should be unaware
of the fact that multiple copies exist, a
property called replication transparency.
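
The three properties above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Fragment` and `DistributedTable` classes, the site names, and the choice of horizontal fragmentation by region are hypothetical, not part of any particular DDBMS.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Fragment:
    """One horizontal fragment of a logical table (hypothetical model)."""
    predicate: Callable[[dict], bool]  # decides which rows belong here
    replicas: dict = field(default_factory=dict)  # site name -> list of rows


class DistributedTable:
    """Hides fragmentation, location, and replication behind one table."""

    def __init__(self, fragments):
        self._fragments = fragments

    def insert(self, row):
        # Fragmentation transparency: the system picks the fragment.
        for frag in self._fragments:
            if frag.predicate(row):
                # Replication transparency: every copy is updated.
                for rows in frag.replicas.values():
                    rows.append(row)
                return
        raise ValueError("no fragment accepts this row")

    def select(self, condition):
        # Location transparency: the caller never names a site.
        result = []
        for frag in self._fragments:
            # Read any one replica; all copies hold the same rows.
            rows = next(iter(frag.replicas.values()))
            result.extend(r for r in rows if condition(r))
        return result


# Assumed layout: "north" rows are replicated at two sites, "south" rows at one.
customers = DistributedTable([
    Fragment(lambda r: r["region"] == "north", {"site_a": [], "site_b": []}),
    Fragment(lambda r: r["region"] == "south", {"site_c": []}),
])
customers.insert({"id": 1, "region": "north"})
customers.insert({"id": 2, "region": "south"})
north_ids = [r["id"] for r in customers.select(lambda r: r["region"] == "north")]
all_rows = customers.select(lambda r: True)
```

The user code calls only `insert` and `select` on the logical table; fragment boundaries, site names, and the number of copies never appear in it.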
Transparency
• To support these properties, it is essential that data item
names be unique.
• In a centralized system, it is easy to verify uniqueness.
• However, in a distributed system, it can be difficult to
ensure that no two sites use the same name for different
data items. It is possible to guarantee systemwide
uniqueness of names if each data item name has a
prefix that is the identifier of the site where the item
originated, called the birth site.
• Often the URL of the site is used as the prefix. However, this
technique compromises location transparency, because the
name itself exposes a site identifier to the user.
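
As a small sketch of the birth-site prefix, assuming a hypothetical `composite_name` helper and a dot as the separator (neither comes from the slides):

```python
def composite_name(birth_site: str, local_name: str) -> str:
    """Prefix a local name with the identifier of the site that created it."""
    return f"{birth_site}.{local_name}"


# Two sites may reuse the same local name without a systemwide clash.
name_a = composite_name("site_a", "EMPLOYEE")
name_b = composite_name("site_b", "EMPLOYEE")
```

The two composite names differ, so uniqueness holds, but each one contains a site identifier, which is exactly what a user is not supposed to need to know.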
Name server
• The loss of location transparency can be
avoided by using aliases for data items, which
the system then maps to the composite
(site-prefixed) name.
• Another approach is to have a name
server where all names are checked for
uniqueness and registered.
• However, the server might become a
bottleneck, compromising performance,
and if the server fails, the entire system is
affected.
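
The alias and name-server ideas can be sketched together. The `NameServer` class, its method names, and the dot-separated composite name are assumptions for illustration only.

```python
class NameServer:
    """Central registry (hypothetical): checks uniqueness and maps aliases."""

    def __init__(self):
        self._registry = {}  # alias -> composite name

    def register(self, alias, birth_site, local_name):
        # Every name is checked for uniqueness before it is registered.
        if alias in self._registry:
            raise KeyError(f"name already registered: {alias}")
        self._registry[alias] = f"{birth_site}.{local_name}"

    def resolve(self, alias):
        # Users supply only the alias; the system maps it to the composite name.
        return self._registry[alias]


ns = NameServer()
ns.register("EMPLOYEE", "site_a", "EMP")
resolved = ns.resolve("EMPLOYEE")

try:
    ns.register("EMPLOYEE", "site_b", "EMP")
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True
```

Because every `register` and `resolve` call goes through a single object, the sketch also makes the drawback visible: that one registry is both the potential bottleneck and a single point of failure.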
DBMS heterogeneity transparency
• Users who need to access data from a remote site with a
different DBMS from their local one should not be aware of
the fact that they are using a different DBMS.
• Their queries should be submitted in the language that
they normally use.
• For example, if they are Caché users, and the remote
site uses Oracle, they should be able to use the Caché
query form to submit queries.
• The system should take care of any translations
needed.
Transaction transparency
• The ACID properties of transactions, discussed for
centralized databases, must also be guaranteed for
distributed transactions.
• ACID (atomicity, consistency, isolation, durability) is a set
of properties of database transactions intended to
guarantee data validity despite errors, power failures, and
other mishaps.
• The system must therefore guarantee concurrency
transparency, which ensures that concurrent transactions
do not interfere with one another.
• It must also provide recovery transparency, handling
systemwide database recovery.
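
The slides do not name a mechanism for distributed atomicity. One widely used protocol is two-phase commit, sketched below under simplifying assumptions: the `Participant` class is hypothetical, and the sketch ignores message loss, timeouts, and logging.

```python
class Participant:
    """A site taking part in a distributed transaction (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Vote yes only if this site is able to commit its part.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if the transaction committed at every site."""
    # Phase 1: ask every site to vote.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote was yes; otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


ok_sites = [Participant("site_a"), Participant("site_b")]
committed = two_phase_commit(ok_sites)

bad_sites = [Participant("site_a"), Participant("site_b", can_commit=False)]
aborted = not two_phase_commit(bad_sites)
```

The outcome is all-or-nothing across sites, which is the atomicity guarantee the user expects without having to know that several sites were involved.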
Performance transparency
• A distributed database system should deliver
performance that is comparable to one using
a centralized architecture.
• For example, a query optimizer should be
used to minimize response time for
distributed queries, so that users do not wait
too long for results.
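
A minimal sketch of one decision such an optimizer might make: choosing which site should answer a query. The cost model (fixed latency plus transfer time) and all the figures are assumptions for illustration, not measurements.

```python
def estimated_response_time(site, result_bytes):
    """Rough cost: fixed network latency plus transfer time for the result."""
    return site["latency_s"] + result_bytes / site["bandwidth_bytes_per_s"]


def choose_site(sites, result_bytes):
    """Pick the site with the lowest estimated response time."""
    return min(sites, key=lambda s: estimated_response_time(s, result_bytes))


# Assumed sites: one nearby with a slow link, one distant with a fast link.
sites = [
    {"name": "site_a", "latency_s": 0.010, "bandwidth_bytes_per_s": 1_000_000},
    {"name": "site_b", "latency_s": 0.100, "bandwidth_bytes_per_s": 100_000_000},
]
small_query_site = choose_site(sites, 1_000)["name"]
large_query_site = choose_site(sites, 50_000_000)["name"]
```

Under these assumed numbers the nearby site wins for a small result and the fast-link site wins for a large one, so the best plan depends on the query, and the user is never asked to choose.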
Assignment
• What do you understand by data distribution transparency?
Briefly explain fragmentation transparency, location
transparency, and replication transparency.
