Distributed databases allow data to be stored across multiple interconnected sites. They provide advantages like increased availability, improved performance, and massive scalability. Distributed databases can be either homogeneous, where all sites use identical database systems, or heterogeneous, where sites may use different systems requiring translations. Data can be distributed through replication, where copies are stored at each site, or fragmentation, where relations are divided into parts stored at different sites. Major companies like Amazon, Google, Netflix and Uber use distributed databases to handle large volumes of diverse data across global networks.
2. Table Of Content
01 What is a Distributed Database?
02 Example of distributed database
03 Distributed Database Advantages
04 Distributed Database components
05 Distributed Database Types
06 Distributed Database Applications
07 Concepts related to database
3. Centralized Database Problems
“Normal database” (for example, a single SQL Server database)
● All requests go to one server, so data traffic is heavy.
● If any kind of system failure occurs in the centralized system, all of the data becomes unavailable and may be lost.
4. Can we just enhance the server?
Is it enough to improve only the server’s performance?
15. Distributed Database Components
1. Connection of database nodes over a computer network
2. Logical interrelation of the connected databases
3. Possible absence of homogeneity among connected nodes
17. Distributed Database Classifications
Distributed databases can be broadly classified into homogeneous and
heterogeneous distributed database environments.
Each has its own pros, cons, and subdivisions.
18. Distributed Database Classifications
1. Homogeneous distributed database
In a homogeneous distributed database, all sites store the database identically: the operating system, the database management system (DBMS), and the data structures used (schema) are the same at all sites. Hence, they are easy to manage as a single distributed DBMS (DDBMS).
19. Distributed Database Classifications
2. Heterogeneous distributed database
In a heterogeneous distributed database, different sites can use different schemas and software, which makes query processing and transactions more complex. A particular site might also be completely unaware of the other sites. Different computers may use different operating systems and different database applications. They may even use different data models for the database.
Therefore, translations are required for different sites to communicate.
A heterogeneous database is also harder to manage and requires more knowledge.
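The translation step can be pictured with a small sketch. Everything here is hypothetical: the two sites, their record layouts, and the common format are assumptions made for illustration, not part of the slides.

```python
# Site A stores customers as dictionaries, site B as tuples (assumed layouts).
site_a_rows = [{"cust_id": 1, "full_name": "Amal Said"}]
site_b_rows = [(2, "Ben", "Koch")]


def from_site_a(row):
    """Translate a site A record into the common format."""
    return {"id": row["cust_id"], "name": row["full_name"]}


def from_site_b(row):
    """Translate a site B record into the common format."""
    cust_id, first, last = row
    return {"id": cust_id, "name": f"{first} {last}"}


def global_view():
    """Combine both sites after translating each record."""
    translated_a = [from_site_a(row) for row in site_a_rows]
    translated_b = [from_site_b(row) for row in site_b_rows]
    return translated_a + translated_b
```

Each site needs its own translator, which is one reason a heterogeneous system takes more effort to manage than a homogeneous one.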
21. Distributed Database Classifications
Types of Homogeneous Distributed Databases:
1. Autonomous
Each database is independent and functions on its own. The databases are integrated by a controlling application and use message passing to share data updates.
2. Non-autonomous
Data is distributed across the homogeneous nodes, and a central or master DBMS coordinates data updates across the sites.
Types of Heterogeneous Distributed Databases:
1. Federated (Single Schema)
The heterogeneous database systems are independent in nature but integrated so that they function as a single database system.
2. Multidatabase (No Schema)
There is no single conceptual global schema. To access data, a schema is constructed dynamically as needed by the application software.
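The difference between the two homogeneous types can be sketched as follows. This is a minimal in-memory illustration: the class names `Site`, `MasterCoordinator`, and `ControllingApplication` are invented for the example and do not come from any real DBMS.

```python
from collections import deque


class Site:
    """One database node holding key/value data."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class MasterCoordinator:
    """Non-autonomous: a central DBMS applies every update to all sites."""

    def __init__(self, sites):
        self.sites = sites

    def update(self, key, value):
        for site in self.sites:
            site.apply(key, value)


class ControllingApplication:
    """Autonomous: sites update themselves and share changes as messages."""

    def __init__(self, sites):
        self.sites = sites
        self.queue = deque()

    def local_update(self, origin, key, value):
        origin.apply(key, value)  # the site acts on its own first
        self.queue.append((origin.name, key, value))

    def deliver(self):
        """Pass queued update messages on to every other site."""
        while self.queue:
            origin_name, key, value = self.queue.popleft()
            for site in self.sites:
                if site.name != origin_name:
                    site.apply(key, value)
```

In the autonomous sketch the other sites see a change only after `deliver()` runs, whereas the master applies it everywhere in one step.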
22. Distributed Data Storage
There are 2 ways in which data can be stored on different sites.
1. Replication
In this approach, an entire relation is stored redundantly at two or more sites. If the entire database is available at all sites, it is a fully redundant database. Hence, in replication, systems maintain copies of data.
Replication is advantageous because it increases the availability of data at different sites, and query requests can be processed in parallel.
The disadvantage is that data needs to be constantly updated. Any change made at one site must be recorded at every site where that relation is stored; otherwise the copies become inconsistent. This adds overhead and makes concurrency control far more complex.
2. Fragmentation
In this approach, the relations are fragmented (i.e., divided into smaller parts) and each fragment is stored at the site where it is required.
Fragmentation is advantageous because it doesn’t create copies of data.
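As a rough sketch of fragmentation, the relation below is split by row so that each site keeps only the rows it needs. The `employees` relation and the choice of `city` as the site are made up for the example.

```python
employees = [
    {"id": 1, "name": "Amal", "city": "Cairo"},
    {"id": 2, "name": "Ben", "city": "Berlin"},
    {"id": 3, "name": "Chen", "city": "Cairo"},
]


def fragment_by_site(rows, site_of):
    """Place each row in the fragment of the site that needs it."""
    fragments = {}
    for row in rows:
        fragments.setdefault(site_of(row), []).append(row)
    return fragments


def reconstruct(fragments):
    """Rebuild the full relation as the union of all fragments."""
    rows = [row for fragment in fragments.values() for row in fragment]
    return sorted(rows, key=lambda row: row["id"])


fragments = fragment_by_site(employees, lambda row: row["city"])
```

Every row lives in exactly one fragment, so no copies are created, and the original relation can still be rebuilt by combining the fragments.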
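The replication trade-off described on this slide can be sketched in a few lines. This is a toy in-memory model; the class `ReplicatedRelation` and the `reachable` parameter are hypothetical names chosen for illustration.

```python
class ReplicatedRelation:
    """Keeps a full copy of a relation at every site."""

    def __init__(self, site_names):
        self.replicas = {name: {} for name in site_names}

    def write(self, key, row, reachable=None):
        """Record the change at every site, or only at the reachable ones."""
        targets = self.replicas if reachable is None else reachable
        for name in targets:
            self.replicas[name][key] = row

    def read(self, key, site):
        """Any site can answer a read from its own copy."""
        return self.replicas[site].get(key)

    def is_consistent(self, key):
        """True when every copy holds the same value for the key."""
        values = [replica.get(key) for replica in self.replicas.values()]
        return all(value == values[0] for value in values)


relation = ReplicatedRelation(["cairo", "berlin"])
relation.write("order-7", "pending")
relation.write("order-7", "shipped", reachable=["cairo"])  # berlin missed it
```

After the second write the two sites disagree about `order-7`, which is the kind of inconsistency that makes concurrency control harder in replicated systems.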
24. Distributed Databases Applications
Distributed systems and distributed databases.
Distributed systems: A distributed system is a computing environment in which various components are spread across multiple computers (or multiple locations) on a network and share information with each other.
25. Applications of Distributed Databases
When and where are distributed databases used?
1. Corporate management information systems.
2. Multimedia applications.
3. Military control systems, hotel chains, etc.
4. Manufacturing control systems.
5. Big data and data mining.
28. Companies That Use Distributed Databases
Apple (uses the Cassandra database to distribute 100 petabytes of data)
Netflix
Uber
eBay
Zalando (a German company that sells shoes and fashion)
SoundCloud
Banks in general
30. Concepts related to Distributed Databases
Transparency
Availability
Reliability
Failure Handling
31. Transparency
Transparency is all about hiding implementation details from end users.
It includes the following forms:
● Data organization transparency
● Data fragmentation transparency
○ Horizontal fragmentation (tuples/rows)
○ Vertical fragmentation (attributes/columns)
● Database design transparency
● Execution transparency
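A small sketch may help show the two kinds of fragmentation and what it means to hide them. The `employees` data, the helper functions, and the `TransparentRelation` class are all invented for this illustration.

```python
employees = [
    {"id": 1, "name": "Amal", "city": "Cairo", "salary": 900},
    {"id": 2, "name": "Ben", "city": "Berlin", "salary": 1100},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows (tuples) that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, columns):
    """Keep the key plus a subset of columns (attributes) of every row."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


class TransparentRelation:
    """Answers lookups as if the relation were stored in one piece."""

    def __init__(self, key, vertical_fragments):
        self.key = key
        self.vertical_fragments = vertical_fragments

    def get(self, key_value):
        merged = {}
        for fragment in self.vertical_fragments:
            for row in fragment:
                if row[self.key] == key_value:
                    merged.update(row)  # join fragments on the shared key
        return merged or None


cairo_rows = horizontal_fragment(employees, lambda row: row["city"] == "Cairo")
public_part = vertical_fragment(employees, "id", ["name", "city"])
payroll_part = vertical_fragment(employees, "id", ["salary"])
relation = TransparentRelation("id", [public_part, payroll_part])
```

A caller of `relation.get(2)` receives the complete row without knowing that the salary column is stored in a separate fragment, which is the idea behind fragmentation transparency.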
33. Availability and Reliability
Availability is the probability that the system is continuously available during a time interval.
Reliability is broadly defined as the probability that a system is running (not down) at a certain point in time.
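To see how keeping copies at several sites affects such probabilities, here is a back-of-the-envelope sketch. It assumes that each site is running with the same probability and that sites fail independently. Neither assumption comes from the slides, and real failures are often correlated.

```python
def prob_at_least_one_up(p_site_up, n_replicas):
    """Probability that at least one of n independent sites is running."""
    return 1 - (1 - p_site_up) ** n_replicas
```

Under these assumptions, a single site that is up 90% of the time gives 0.9, two such sites give about 0.99, and three give about 0.999.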
34. Failure Handling
Both availability and reliability are affected when a failure or error occurs.
A distributed database must have a mechanism for dealing with errors and failures when they occur.
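One simple failure-handling mechanism is to retry a read at another site when the first one is down. The sketch below shows only that idea; `SiteDown`, `read_with_failover`, and the two fetch functions are hypothetical, and a real system would also need timeouts, recovery, and consistency checks.

```python
class SiteDown(Exception):
    """Raised by a site that cannot answer (hypothetical error type)."""


def read_with_failover(key, sites):
    """Try each (name, fetch) pair in order and return the first answer."""
    failures = []
    for name, fetch in sites:
        try:
            return name, fetch(key)
        except SiteDown as exc:
            failures.append(f"{name}: {exc}")  # remember why this site failed
    raise SiteDown("all sites failed (" + "; ".join(failures) + ")")


def broken_fetch(key):
    raise SiteDown("connection refused")


replica_data = {"order-7": "shipped"}


def healthy_fetch(key):
    return replica_data[key]


served_by, value = read_with_failover(
    "order-7", [("primary", broken_fetch), ("replica", healthy_fetch)]
)
```

Here the primary site fails and the replica answers, so the caller still gets a result. If every site fails, the error is raised to the caller instead of being hidden.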