Distribution transparency and Distributed transaction

1. Distribution Transparency
Distribution transparency is the property of a distributed
database by virtue of which the internal details of the
distribution are hidden from the users. The DDBMS designer
may choose to fragment tables, replicate the fragments and
store them at different sites. However, since users are
unaware of these details, they find the distributed database
as easy to use as any centralized database.
The three dimensions of distribution transparency are:
Location transparency
Fragmentation transparency
Replication transparency

Location Transparency
Location transparency ensures that the user can query any
table(s) or fragment(s) of a table as if they were stored locally at
the user's site. The fact that the table or its fragments are stored
at a remote site in the distributed database system should be
completely hidden from the end user. The address of the remote
site(s) and the access mechanisms are likewise hidden.
To provide location transparency, the DDBMS should
have access to an up-to-date and accurate data dictionary and
DDBMS directory, which contain the details of the locations of
data.
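
The directory lookup described above can be sketched as follows. This is a minimal illustration only: the in-memory `SITES` and `DIRECTORY` structures, the `query` function and all table and site names are hypothetical, and a dictionary access stands in for a real remote fetch.

```python
# Hypothetical sites, each holding some tables (names invented for illustration).
SITES = {
    "site_a": {"customer": [("c1", "Asha"), ("c2", "Ravi")]},
    "site_b": {"orders": [("o1", "c1"), ("o2", "c2")]},
}

# DDBMS directory: maps each table name to the site that stores it.
DIRECTORY = {"customer": "site_a", "orders": "site_b"}


def query(table):
    """Return all rows of `table`; the caller never names a site."""
    site = DIRECTORY[table]       # location resolved by the DDBMS
    return SITES[site][table]     # stands in for a remote fetch


rows = query("orders")
print(rows)
```

The caller passes only a table name, so moving `orders` to another site would change `DIRECTORY` but not the calling code.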

Fragmentation Transparency
Fragmentation transparency enables users to query
any table as if it were unfragmented. Thus, it
hides the fact that the table the user is querying is
actually a fragment or a union of several fragments. It also
conceals the fact that the fragments are located at
different sites.
This is somewhat similar to the use of SQL views, where
the user may not know that they are using a view of a
table instead of the table itself.
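
As a rough sketch of the idea, assume a table that has been split horizontally by city. The fragment names, the `SCHEMA` mapping and the `select_all` helper below are all hypothetical; they only show how a union of fragments can be presented as one table.

```python
# Hypothetical horizontal fragments of one EMPLOYEE table, split by city.
FRAGMENTS = {
    "employee_pune": [(1, "Asha", "Pune")],
    "employee_delhi": [(2, "Ravi", "Delhi"), (3, "Meena", "Delhi")],
}

# Fragmentation schema: which fragments make up each global table.
SCHEMA = {"employee": ["employee_pune", "employee_delhi"]}


def select_all(table):
    """Rebuild the global table as the union of its fragments."""
    rows = []
    for fragment in SCHEMA[table]:
        rows.extend(FRAGMENTS[fragment])
    return sorted(rows)


employees = select_all("employee")
```

The user asks for `employee` and never sees the fragment names, much as a view hides the tables behind it.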

Replication Transparency
Replication transparency ensures that replication of databases
is hidden from the users. It enables users to query a table
as if only a single copy of the table existed.
Replication transparency is associated with concurrency
transparency and failure transparency. Whenever a user updates a
data item, the update is reflected in all the copies of the table,
but the user should not be aware of this propagation. This is
concurrency transparency. Also, if a site fails, the user
can still proceed with their queries using replicated copies without
any knowledge of the failure. This is failure transparency.
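
A toy sketch of both behaviours, assuming a simple write-to-all, read-from-any policy (an assumption of this example, not something the text prescribes). The `ReplicatedTable` class and the site names are hypothetical.

```python
class ReplicatedTable:
    """Toy table kept as identical copies on several hypothetical sites."""

    def __init__(self, sites):
        self.copies = {site: {} for site in sites}   # site -> key/value copy
        self.down = set()                            # sites currently failed

    def write(self, key, value):
        # One user-level write is applied to every available copy.
        for site, copy in self.copies.items():
            if site not in self.down:
                copy[key] = value

    def read(self, key):
        # Read from the first available copy; the caller never picks a site.
        for site, copy in self.copies.items():
            if site not in self.down:
                return copy[key]
        raise RuntimeError("no replica available")


table = ReplicatedTable(["site_a", "site_b"])
table.write("x", 10)
table.down.add("site_a")      # simulate a site failure
value = table.read("x")       # still answered, from site_b
```

The sketch deliberately ignores recovery: a site that was down during a write would hold a stale copy when it returns, which a real DDBMS has to reconcile.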

Combination of Transparencies
In any distributed database system, the designer should
ensure that all the stated transparencies are maintained
to a considerable extent. The designer may choose to
fragment tables, replicate them and store them at
different sites, all hidden from the end user. However,
complete distribution transparency is difficult to achieve and
requires considerable design effort.

2. Distributed Transaction
 A distributed transaction is a database
transaction in which two or more network
hosts are involved.
 Usually, hosts provide transactional
resources, while the transaction manager is
responsible for creating and managing a
global transaction that encompasses all
operations against such resources.
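
The text does not say how the transaction manager keeps the hosts in step. One widely used protocol for this is two-phase commit, and the sketch below assumes it purely for illustration. The `Resource` and `TransactionManager` classes are hypothetical and run in memory; no real network or logging is involved.

```python
class Resource:
    """A transactional resource on one host (hypothetical, in-memory)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


class TransactionManager:
    """Coordinates one global transaction across several resources."""

    def run(self, resources):
        # Phase 1: collect a vote from every participant.
        votes = [r.prepare() for r in resources]
        # Phase 2: commit only if all voted yes, otherwise roll back all.
        if all(votes):
            for r in resources:
                r.commit()
            return "committed"
        for r in resources:
            r.rollback()
        return "aborted"


tm = TransactionManager()
ok = tm.run([Resource("host1"), Resource("host2")])
bad_resources = [Resource("host1"), Resource("host2", can_commit=False)]
bad = tm.run(bad_resources)
```

A single "no" vote from `host2` is enough to roll back the work on `host1` as well, which is what makes the global transaction behave as one unit.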

A transaction has four properties (ACID):
1. Atomicity
Atomicity means that either all of a transaction
happens or none of it does. Complex operations run as one
single unit, all or nothing, and a crash, power failure, error, or anything
else cannot leave the system in a state in which only some of the related
changes have happened.
2. Consistency
Consistency means that the data is guaranteed to remain
consistent; none of the constraints defined on related data will ever
be violated.
3. Isolation
Isolation means that one transaction cannot read data from another
transaction that is not yet completed. If two transactions are executing
concurrently, each one sees the world as if they were executing
sequentially, and if one needs to read data that is written by another,
it has to wait until the other is finished.
4. Durability
Durability means that once a transaction is complete, it is guaranteed that all of
the changes have been recorded to a durable medium (such as a hard disk), and
the fact that the transaction has been completed is likewise recorded.
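
Atomicity and consistency can be observed on a single node with Python's standard `sqlite3` module. This is a small sketch, not a distributed setup: the `account` table, the `transfer` and `balances` helpers and the amounts are invented for the example, and isolation and durability are not demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"   # consistency constraint
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(amount):
    """Move `amount` from a to b as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'b'",
                (amount,),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'a'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as a name -> balance dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


failed = transfer(500)        # would overdraw a, so nothing is applied
after_failed = balances()
```

The credit to `b` runs first and succeeds, yet it is undone when the debit from `a` violates the `CHECK` constraint, so the failed transfer leaves both balances untouched.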

3. How deadlock detection is different for a distributed system
Deadlock detection algorithms are simplified
by maintaining a wait-for graph (WFG) and
searching it for cycles. In a distributed system the
WFG is spread across sites, so no single site sees
every edge.
The different approaches for deadlock
detection are:

Centralized Approach for Deadlock Detection
In this approach a local coordinator at each
site maintains a WFG for its local resources,
and a central coordinator constructs the
union of all the individual WFGs.
The central coordinator builds this global
WFG from the information received from the
local coordinators of all sites.
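
The centralized scheme can be sketched as a graph union followed by a cycle search. The representation below (a dictionary mapping each waiting transaction to the set of transactions it waits for) and the names `merge_wfgs` and `has_cycle` are assumptions of this example.

```python
def merge_wfgs(local_wfgs):
    """Union the local wait-for graphs into one global graph."""
    global_wfg = {}
    for wfg in local_wfgs:
        for waiter, holders in wfg.items():
            global_wfg.setdefault(waiter, set()).update(holders)
    return global_wfg


def has_cycle(wfg):
    """Depth-first search for a cycle; a cycle means deadlock."""
    visiting, done = set(), set()

    def visit(node):
        if node in visiting:
            return True           # back edge: node waits on itself indirectly
        if node in done:
            return False
        visiting.add(node)
        found = any(visit(nxt) for nxt in wfg.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(wfg))


site1 = {"T1": {"T2"}}            # at site 1, T1 waits for T2
site2 = {"T2": {"T1"}}            # at site 2, T2 waits for T1
deadlocked = has_cycle(merge_wfgs([site1, site2]))
```

Neither local graph contains a cycle on its own; the deadlock only appears once the central coordinator merges them.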

Hierarchical Approach for Deadlock Detection
The hierarchical approach overcomes drawbacks
of the centralized approach.
This approach uses a logical hierarchy of deadlock
detectors called controllers.
Each controller detects only those deadlocks that
involve the sites falling within its range of the
hierarchy. The global WFG is distributed over a number
of different controllers in this approach.

Fully Distributed Approaches for Deadlock Detection
In this approach each site shares equal
responsibility for deadlock detection.
Two kinds of algorithm are used: the first is based on
construction of the WFG, and the second is a probe-based
algorithm.
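
The text does not name a specific probe-based algorithm. The sketch below assumes an edge-chasing scheme in the style of Chandy, Misra and Haas, simulated in a single process: a probe `(initiator, sender, receiver)` is forwarded along wait-for edges, and a deadlock is declared if it returns to its initiator. The function name and the `waits_for` mapping are hypothetical.

```python
from collections import deque


def probe_detects_deadlock(initiator, waits_for):
    """Send probes along wait-for edges; deadlock if one returns home.

    `waits_for` maps each transaction to the transactions it is blocked on.
    A probe is a tuple (initiator, sender, receiver).
    """
    probes = deque(
        (initiator, initiator, nxt) for nxt in waits_for.get(initiator, ())
    )
    forwarded = set()                 # receivers that already forwarded
    while probes:
        origin, _sender, receiver = probes.popleft()
        if receiver == origin:
            return True               # probe came back: cycle found
        if receiver in forwarded:
            continue                  # each transaction forwards only once
        forwarded.add(receiver)
        for nxt in waits_for.get(receiver, ()):
            probes.append((origin, receiver, nxt))
    return False


cycle = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
chain = {"T1": ["T2"], "T2": ["T3"]}
in_deadlock = probe_detects_deadlock("T1", cycle)
not_in_deadlock = probe_detects_deadlock("T1", chain)
```

In a real system each forwarding step would be a message between sites, so no site ever needs to hold the whole WFG.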

4. Comparison between Process and Thread

Definition
 Process: An executing instance of a program is called a process.
 Thread: A thread is a subset of the process.
Data segment
 Process: It has its own copy of the data segment of the parent process.
 Thread: It has direct access to the data segment of its process.
Communication
 Process: Processes must use inter-process communication to communicate with sibling processes.
 Thread: Threads can communicate directly with other threads of their process.
Overheads
 Process: Processes have considerable overhead.
 Thread: Threads have very little overhead.
Creation
 Process: New processes require duplication of the parent process.
 Thread: New threads are easily created.
Control
 Process: Processes can only exercise control over child processes.
 Thread: Threads can exercise considerable control over threads of the same process.
Changes
 Process: Any change in the parent process does not affect child processes.
 Thread: Any change in the main thread may affect the behavior of the other threads of the process.
Memory
 Process: Run in separate memory spaces.
 Thread: Run in shared memory spaces.
File descriptors
 Process: Most file descriptors are not shared.
 Thread: File descriptors are shared.
File system
 Process: There is no sharing of file system context.
 Thread: File system context is shared.
Signal
 Process: Signal handling is not shared.
 Thread: Signal handling is shared.
Controlled by
 Process: A process is controlled by the operating system.
 Thread: Threads are controlled by the programmer within a program.
Dependence
 Process: Processes are independent.
 Thread: Threads are dependent.
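
The communication and memory rows can be illustrated with Python's standard library. In this sketch the threads write straight into a shared list, while a child process started with `subprocess` has to send its result back through a pipe. The variable names and the tiny workloads are invented for the example.

```python
import subprocess
import sys
import threading

# Threads: every thread of the process sees the same list object.
shared = []


def worker(n):
    shared.append(n * n)      # direct write into memory shared by all threads


threads = [threading.Thread(target=worker, args=(n,)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Process: the child has its own memory, so the result comes back via a pipe.
child = subprocess.run(
    [sys.executable, "-c", "print(7 * 7)"],
    capture_output=True,
    text=True,
    check=True,
)
from_child = int(child.stdout)
```

The child cannot append to `shared`; its only route back to the parent here is the captured standard output, a simple form of inter-process communication.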

Types of Thread -
Threads are implemented in the following two ways −
1. User Level Threads − threads managed by the user (application).
2. Kernel Level Threads − threads managed by the operating
system, acting on the kernel, the operating system core.

1. User Level Threads-
In this case, the kernel is not aware of the existence
of threads; thread management is done by a thread library
in the application. The thread library
contains code for creating and destroying threads, for
passing messages and data between threads, for
scheduling thread execution and for saving and
restoring thread contexts. The application starts with
a single thread.
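
A user-level thread library can be imitated in a few lines with Python generators. This is only an analogy, under the assumption that a generator frame stands in for a saved thread context; the `task` and `run` names and the round-robin policy are choices made for this example.

```python
from collections import deque


def task(name, steps, log):
    """A user-level thread: each `yield` hands control to the scheduler."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield                      # context is saved in the generator frame


def run(tasks):
    """Round-robin scheduler living entirely in user space."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # restore context and run to next yield
            ready.append(current)  # still runnable: back of the queue
        except StopIteration:
            pass                   # thread finished: drop it


log = []
run([task("a", 2, log), task("b", 3, log)])
```

After the call, `log` holds the interleaved steps of both tasks. The operating system sees a single thread throughout, which is also why one blocking call inside a task would stall every other task.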

Advantages -
o Thread switching does not require Kernel mode privileges.
o User level threads can run on any operating system.
o Scheduling can be application specific in the user level
  thread.
o User level threads are fast to create and manage.
Disadvantages -
o In a typical operating system, most system calls are blocking,
  so a blocking call by one thread blocks the whole process.
o A multithreaded application cannot take advantage of
  multiprocessing.

2. Kernel Level Threads -
In this case, thread management is done by the
Kernel. There is no thread management code in the
application area. Kernel threads are supported
directly by the operating system. Any application can
be programmed to be multithreaded. All of the
threads within an application are supported within a
single process.
The Kernel maintains context information for the
process as a whole and for individual threads within
the process. Scheduling by the Kernel is done on a
thread basis. The Kernel performs thread creation,
scheduling and management in Kernel space. Kernel
threads are generally slower to create and manage
than user threads.

Advantages -
o The Kernel can simultaneously schedule multiple threads from
  the same process on multiple processors.
o If one thread in a process is blocked, the Kernel can
  schedule another thread of the same process.
o Kernel routines themselves can be multithreaded.
Disadvantages -
o Kernel threads are generally slower to create and manage
  than user threads.
o Transfer of control from one thread to another within the
  same process requires a mode switch to the Kernel.
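
The second advantage can be observed with Python's `threading` module, assuming CPython, where each `threading.Thread` is backed by an operating-system thread. The function names and the 0.2 second sleep are arbitrary choices for this sketch.

```python
import threading
import time

events = []


def blocked():
    time.sleep(0.2)               # blocking call: only this thread waits
    events.append("blocked done")


def busy():
    events.append("busy done")    # runs while the other thread sleeps


t1 = threading.Thread(target=blocked)
t2 = threading.Thread(target=busy)
t1.start()
t2.start()
t1.join()
t2.join()
```

Although `blocked` starts first, `busy` records its event while `blocked` is still sleeping, because the kernel schedules the second thread of the same process in the meantime.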
