SlideShare a Scribd company logo
Principles of Distributed Database
Systems
M. Tamer Özsu
Patrick Valduriez
© 2020, M.T. Özsu & P. Valduriez 1
Outline
 Introduction
 Distributed and Parallel Database Design
 Distributed Data Control
 Distributed Query Processing
 Distributed Transaction Processing
 Data Replication
 Database Integration – Multidatabase Systems
 Parallel Database Systems
 Peer-to-Peer Data Management
 Big Data Processing
 NoSQL, NewSQL and Polystores
 Web Data Management
© 2020, M.T. Özsu & P. Valduriez 2
Outline
 Introduction
 What is a distributed DBMS
 History
 Distributed DBMS promises
 Design issues
 Distributed DBMS architecture
© 2020, M.T. Özsu & P. Valduriez 3
Distributed Computing
 A number of autonomous processing elements (not
necessarily homogeneous) that are interconnected by a
computer network and that cooperate in performing their
assigned tasks.
 What is being distributed?
 Processing logic
 Function
 Data
 Control
© 2020, M.T. Özsu & P. Valduriez 4
Current Distribution – Geographically
Distributed Data Centers
© 2020, M.T. Özsu & P. Valduriez 5
What is a Distributed Database System?
A distributed database is a collection of multiple, logically
interrelated databases distributed over a computer network
A distributed database management system (Distributed
DBMS) is the software that manages the DDB and provides
an access mechanism that makes this distribution
transparent to the users
© 2020, M.T. Özsu & P. Valduriez 6
What is not a DDBS?
 A timesharing computer system
 A loosely or tightly coupled multiprocessor system
 A database system which resides at one of the nodes of
a network of computers - this is a centralized database
on a network node
© 2020, M.T. Özsu & P. Valduriez 7
Distributed DBMS Environment
© 2020, M.T. Özsu & P. Valduriez 8
Implicit Assumptions
 Data stored at a number of sites → each site logically
consists of a single processor
 Processors at different sites are interconnected by a
computer network → not a multiprocessor system
 Parallel database systems
 Distributed database is a database, not a collection of
files → data logically related as exhibited in the users’
access patterns
 Relational data model
 Distributed DBMS is a full-fledged DBMS
 Not remote file system, not a TP system
© 2020, M.T. Özsu & P. Valduriez 9
Important Point
Logically integrated
but
Physically distributed
© 2020, M.T. Özsu & P. Valduriez 10
Outline
 Introduction

 History



© 2020, M.T. Özsu & P. Valduriez 11
History – File Systems
© 2020, M.T. Özsu & P. Valduriez 12
History – Database Management
© 2020, M.T. Özsu & P. Valduriez 13
History – Early Distribution
© 2020, M.T. Özsu & P. Valduriez 14
Peer-to-Peer (P2P)
History – Client/Server
© 2020, M.T. Özsu & P. Valduriez 15
History – Data Integration
© 2020, M.T. Özsu & P. Valduriez 16
History – Cloud Computing
© 2020, M.T. Özsu & P. Valduriez 17
On-demand, reliable services provided over the Internet in
a cost-efficient manner
 Cost savings: no need to maintain dedicated compute
power
 Elasticity: better adaptivity to changing workload
Data Delivery Alternatives
 Delivery modes
 Pull-only
 Push-only
 Hybrid
 Frequency
 Periodic
 Conditional
 Ad-hoc or irregular
 Communication Methods
 Unicast
 One-to-many
 Note: not all combinations make sense
© 2020, M.T. Özsu & P. Valduriez 18
Outline
 Introduction


 Distributed DBMS promises


© 2020, M.T. Özsu & P. Valduriez 19
Distributed DBMS Promises
 Transparent management of distributed, fragmented,
and replicated data
 Improved reliability/availability through distributed
transactions
 Improved performance
 Easier and more economical system expansion
© 2020, M.T. Özsu & P. Valduriez
Transparency
 Transparency is the separation of the higher-level
semantics of a system from the lower level
implementation issues.
 Fundamental issue is to provide
data independence
in the distributed environment
 Network (distribution) transparency
 Replication transparency
 Fragmentation transparency
 horizontal fragmentation: selection
 vertical fragmentation: projection
 hybrid
© 2020, M.T. Özsu & P. Valduriez
Example
© 2020, M.T. Özsu & P. Valduriez 22
Transparent Access
SELECT ENAME,SAL
FROM EMP,ASG,PAY
WHERE DUR > 12
AND EMP.ENO = ASG.ENO
AND PAY.TITLE = EMP.TITLE
Paris projects
Paris employees
Paris assignments
Boston employees
Montreal projects
Paris projects
New York projects
with budget > 200000
Montreal employees
Montreal assignments
Boston
Communication
Network
Montreal
Paris
New
York
Boston projects
Boston employees
Boston assignments
Boston projects
New York employees
New York projects
New York assignments
Tokyo
© 2020, M.T. Özsu & P. Valduriez 23
Distributed Database - User View
Distributed Database
© 2020, M.T. Özsu & P. Valduriez 24
Distributed DBMS - Reality
Communication
Subsystem
DBMS
Software
User
Application
User
Query
DBMS
Software
DBMS
Software
DBMS
Software
User
Query
DBMS
Software
User
Query
User
Application
© 2020, M.T. Özsu & P. Valduriez 25
Types of Transparency
 Data independence
 Network transparency (or distribution transparency)
 Location transparency
 Fragmentation transparency
 Fragmentation transparency
 Replication transparency
© 2020, M.T. Özsu & P. Valduriez 26
Reliability Through Transactions
 Replicated components and data should make distributed
DBMS more reliable.
 Distributed transactions provide
 Concurrency transparency
 Failure atomicity
• Distributed transaction support requires implementation of
 Distributed concurrency control protocols
 Commit protocols
 Data replication
 Great for read-intensive workloads, problematic for updates
 Replication protocols
© 2020, M.T. Özsu & P. Valduriez 27
Potentially Improved Performance
 Proximity of data to its points of use
 Requires some support for fragmentation and replication
 Parallelism in execution
 Inter-query parallelism
 Intra-query parallelism
© 2020, M.T. Özsu & P. Valduriez 28
Scalability
 Issue is database scaling and workload scaling
 Adding processing and storage power
 Scale-out: add more servers
 Scale-up: increase the capacity of one server → has limits
© 2020, M.T. Özsu & P. Valduriez 29
Outline
 Introduction



 Design issues

© 2020, M.T. Özsu & P. Valduriez 30
Distributed DBMS Issues
 Distributed database design
 How to distribute the database
 Replicated & non-replicated database distribution
 A related problem in directory management
 Distributed query processing
 Convert user transactions to data manipulation instructions
 Optimization problem
 min{cost = data transmission + local processing}
 General formulation is NP-hard
© 2020, M.T. Özsu & P. Valduriez 31
Distributed DBMS Issues
 Distributed concurrency control
 Synchronization of concurrent accesses
 Consistency and isolation of transactions' effects
 Deadlock management
 Reliability
 How to make the system resilient to failures
 Atomicity and durability
© 2020, M.T. Özsu & P. Valduriez 32
Distributed DBMS Issues
 Replication
 Mutual consistency
 Freshness of copies
 Eager vs lazy
 Centralized vs distributed
 Parallel DBMS
 Objectives: high scalability and performance
 Not geo-distributed
 Cluster computing
© 2020, M.T. Özsu & P. Valduriez 33
Related Issues
 Alternative distribution approaches
 Modern P2P
 World Wide Web (WWW or Web)
 Big data processing
 4V: volume, variety, velocity, veracity
 MapReduce & Spark
 Stream data
 Graph analytics
 NoSQL
 NewSQL
 Polystores
© 2020, M.T. Özsu & P. Valduriez 34
Outline
 Introduction




 Distributed DBMS architecture
© 2020, M.T. Özsu & P. Valduriez 35
DBMS Implementation Alternatives
© 2020, M.T. Özsu & P. Valduriez 36
Dimensions of the Problem
 Distribution
 Whether the components of the system are located on the same machine or
not
 Heterogeneity
 Various levels (hardware, communications, operating system)
 DBMS important one
 data model, query language,transaction management algorithms
 Autonomy
 Not well understood and most troublesome
 Various versions
 Design autonomy: Ability of a component DBMS to decide on issues related to its
own design.
 Communication autonomy: Ability of a component DBMS to decide whether and
how to communicate with other DBMSs.
 Execution autonomy: Ability of a component DBMS to execute local operations in
any manner it wants to.
© 2020, M.T. Özsu & P. Valduriez 37
Client/Server Architecture
© 2020, M.T. Özsu & P. Valduriez 38
Advantages of Client-Server
Architectures
 More efficient division of labor
 Horizontal and vertical scaling of resources
 Better price/performance on client machines
 Ability to use familiar tools on client machines
 Client access to remote data (via standards)
 Full DBMS functionality provided to client workstations
 Overall better system price/performance
© 2020, M.T. Özsu & P. Valduriez 39
Database Server
© 2020, M.T. Özsu & P. Valduriez 40
Distributed Database Servers
© 2020, M.T. Özsu & P. Valduriez 41
Peer-to-Peer Component Architecture
© 2020, M.T. Özsu & P. Valduriez 42
MDBS Components & Execution
© 2020, M.T. Özsu & P. Valduriez 43
Mediator/Wrapper Architecture
© 2020, M.T. Özsu & P. Valduriez 44
Cloud Computing
© 2020, M.T. Özsu & P. Valduriez 45
On-demand, reliable services provided over the Internet in
a cost-efficient manner
‫التكلفة‬ ‫حيث‬ ‫من‬ ‫فعالة‬ ‫بطريقة‬ ‫اإلنترنت‬ ‫عبر‬ ‫مقدمة‬ ‫الطلب‬ ‫عند‬ ‫موثوقة‬ ‫خدمات‬
 IaaS – Infrastructure-as-a-Service
 PaaS – Platform-as-a-Service
 SaaS – Software-as-a-Service
 DaaS – Database-as-a-Service
Simplified Cloud Architecture
© 2020, M.T. Özsu & P. Valduriez 46

More Related Content

Similar to 1- Introduction for software engineering

Sycamore Quantum Computer 2019 developed.pptx
Sycamore Quantum Computer 2019 developed.pptxSycamore Quantum Computer 2019 developed.pptx
Sycamore Quantum Computer 2019 developed.pptx
shujee381
 
Data and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the CloudData and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the Cloud
redmondpulver
 
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
Denodo
 
Introduction to Modern Data Virtualization (US)
Introduction to Modern Data Virtualization (US)Introduction to Modern Data Virtualization (US)
Introduction to Modern Data Virtualization (US)
Denodo
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Denodo
 
AtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White PapaerAtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White Papaer
JEAN-MICHEL LETENNIER
 
Itc571 Project Presentation
Itc571 Project PresentationItc571 Project Presentation
Itc571 Project Presentation
Dinh Khue
 
Presentation on Databases in the Cloud
Presentation on Databases in the CloudPresentation on Databases in the Cloud
Presentation on Databases in the Cloud
moshfiq
 
Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User Information
Denodo
 
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
Denodo
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptx
Hong Ong
 
Unit-I_dbms_TT_Final.pptx
Unit-I_dbms_TT_Final.pptxUnit-I_dbms_TT_Final.pptx
Unit-I_dbms_TT_Final.pptx
UnknownUnknown252665
 
A Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data VirtualizationA Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data Virtualization
Denodo
 
cloud computing, touch screen, dms and cores
cloud computing, touch screen, dms and corescloud computing, touch screen, dms and cores
cloud computing, touch screen, dms and cores
Wajiha Muhammad Ismail
 
cc.doc
cc.doccc.doc
cc.doc
maheshlucky3
 
Challenges Management and Opportunities of Cloud DBA
Challenges Management and Opportunities of Cloud DBAChallenges Management and Opportunities of Cloud DBA
Challenges Management and Opportunities of Cloud DBA
inventy
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Denodo
 
Myth Busters VII: I’m building a data mesh, so I don’t need data virtualization
Myth Busters VII: I’m building a data mesh, so I don’t need data virtualizationMyth Busters VII: I’m building a data mesh, so I don’t need data virtualization
Myth Busters VII: I’m building a data mesh, so I don’t need data virtualization
Denodo
 
2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...
2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...
2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...
IEEEFINALSEMSTUDENTPROJECTS
 
Cloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdfCloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdf
kalai75
 

Similar to 1- Introduction for software engineering (20)

Sycamore Quantum Computer 2019 developed.pptx
Sycamore Quantum Computer 2019 developed.pptxSycamore Quantum Computer 2019 developed.pptx
Sycamore Quantum Computer 2019 developed.pptx
 
Data and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the CloudData and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the Cloud
 
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
Reinventing and Simplifying Data Management for a Successful Hybrid and Multi...
 
Introduction to Modern Data Virtualization (US)
Introduction to Modern Data Virtualization (US)Introduction to Modern Data Virtualization (US)
Introduction to Modern Data Virtualization (US)
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
AtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White PapaerAtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White Papaer
 
Itc571 Project Presentation
Itc571 Project PresentationItc571 Project Presentation
Itc571 Project Presentation
 
Presentation on Databases in the Cloud
Presentation on Databases in the CloudPresentation on Databases in the Cloud
Presentation on Databases in the Cloud
 
Govern and Protect Your End User Information
Govern and Protect Your End User InformationGovern and Protect Your End User Information
Govern and Protect Your End User Information
 
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptx
 
Unit-I_dbms_TT_Final.pptx
Unit-I_dbms_TT_Final.pptxUnit-I_dbms_TT_Final.pptx
Unit-I_dbms_TT_Final.pptx
 
A Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data VirtualizationA Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data Virtualization
 
cloud computing, touch screen, dms and cores
cloud computing, touch screen, dms and corescloud computing, touch screen, dms and cores
cloud computing, touch screen, dms and cores
 
cc.doc
cc.doccc.doc
cc.doc
 
Challenges Management and Opportunities of Cloud DBA
Challenges Management and Opportunities of Cloud DBAChallenges Management and Opportunities of Cloud DBA
Challenges Management and Opportunities of Cloud DBA
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
Myth Busters VII: I’m building a data mesh, so I don’t need data virtualization
Myth Busters VII: I’m building a data mesh, so I don’t need data virtualizationMyth Busters VII: I’m building a data mesh, so I don’t need data virtualization
Myth Busters VII: I’m building a data mesh, so I don’t need data virtualization
 
2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...
2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...
2014 IEEE DOTNET CLOUD COMPUTING PROJECT Distributed -concurrent--and-indepen...
 
Cloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdfCloud and Bid data Dr.VK.pdf
Cloud and Bid data Dr.VK.pdf
 

Recently uploaded

2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
Yasser Mahgoub
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
architagupta876
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
Software Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.pptSoftware Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.ppt
TaghreedAltamimi
 
Engineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdfEngineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdf
abbyasa1014
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
gowrishankartb2005
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
RamonNovais6
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
Divyanshu
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
Hitesh Mohapatra
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
ecqow
 
An Introduction to the Compiler Designss
An Introduction to the Compiler DesignssAn Introduction to the Compiler Designss
An Introduction to the Compiler Designss
ElakkiaU
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
Mahmoud Morsy
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
21UME003TUSHARDEB
 
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
ydzowc
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
GauravCar
 
CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1
PKavitha10
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
VICTOR MAESTRE RAMIREZ
 
cnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classicationcnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classication
SakkaravarthiShanmug
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
VANDANAMOHANGOUDA
 

Recently uploaded (20)

2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
Software Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.pptSoftware Quality Assurance-se412-v11.ppt
Software Quality Assurance-se412-v11.ppt
 
Engineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdfEngineering Drawings Lecture Detail Drawings 2014.pdf
Engineering Drawings Lecture Detail Drawings 2014.pdf
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
 
Generative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of contentGenerative AI leverages algorithms to create various forms of content
Generative AI leverages algorithms to create various forms of content
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
一比一原版(CalArts毕业证)加利福尼亚艺术学院毕业证如何办理
 
An Introduction to the Compiler Designss
An Introduction to the Compiler DesignssAn Introduction to the Compiler Designss
An Introduction to the Compiler Designss
 
Certificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi AhmedCertificates - Mahmoud Mohamed Moursi Ahmed
Certificates - Mahmoud Mohamed Moursi Ahmed
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
 
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 
CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
 
cnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classicationcnn.pptx Convolutional neural network used for image classication
cnn.pptx Convolutional neural network used for image classication
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
 

1- Introduction for software engineering

  • 1. Principles of Distributed Database Systems M. Tamer Özsu Patrick Valduriez © 2020, M.T. Özsu & P. Valduriez 1
  • 2. Outline  Introduction  Distributed and Parallel Database Design  Distributed Data Control  Distributed Query Processing  Distributed Transaction Processing  Data Replication  Database Integration – Multidatabase Systems  Parallel Database Systems  Peer-to-Peer Data Management  Big Data Processing  NoSQL, NewSQL and Polystores  Web Data Management © 2020, M.T. Özsu & P. Valduriez 2
  • 3. Outline  Introduction  What is a distributed DBMS  History  Distributed DBMS promises  Design issues  Distributed DBMS architecture © 2020, M.T. Özsu & P. Valduriez 3
  • 4. Distributed Computing  A number of autonomous processing elements (not necessarily homogeneous) that are interconnected by a computer network and that cooperate in performing their assigned tasks.  What is being distributed?  Processing logic  Function  Data  Control © 2020, M.T. Özsu & P. Valduriez 4
  • 5. Current Distribution – Geographically Distributed Data Centers © 2020, M.T. Özsu & P. Valduriez 5
  • 6. What is a Distributed Database System? A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network A distributed database management system (Distributed DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users © 2020, M.T. Özsu & P. Valduriez 6
  • 7. What is not a DDBS?  A timesharing computer system  A loosely or tightly coupled multiprocessor system  A database system which resides at one of the nodes of a network of computers - this is a centralized database on a network node © 2020, M.T. Özsu & P. Valduriez 7
  • 8. Distributed DBMS Environment © 2020, M.T. Özsu & P. Valduriez 8
  • 9. Implicit Assumptions  Data stored at a number of sites → each site logically consists of a single processor  Processors at different sites are interconnected by a computer network → not a multiprocessor system  Parallel database systems  Distributed database is a database, not a collection of files → data logically related as exhibited in the users’ access patterns  Relational data model  Distributed DBMS is a full-fledged DBMS  Not remote file system, not a TP system © 2020, M.T. Özsu & P. Valduriez 9
  • 10. Important Point Logically integrated but Physically distributed © 2020, M.T. Özsu & P. Valduriez 10
  • 11. Outline  Introduction   History    © 2020, M.T. Özsu & P. Valduriez 11
  • 12. History – File Systems © 2020, M.T. Özsu & P. Valduriez 12
  • 13. History – Database Management © 2020, M.T. Özsu & P. Valduriez 13
  • 14. History – Early Distribution © 2020, M.T. Özsu & P. Valduriez 14 Peer-to-Peer (P2P)
  • 15. History – Client/Server © 2020, M.T. Özsu & P. Valduriez 15
  • 16. History – Data Integration © 2020, M.T. Özsu & P. Valduriez 16
  • 17. History – Cloud Computing © 2020, M.T. Özsu & P. Valduriez 17 On-demand, reliable services provided over the Internet in a cost-efficient manner  Cost savings: no need to maintain dedicated compute power  Elasticity: better adaptivity to changing workload
  • 18. Data Delivery Alternatives  Delivery modes  Pull-only  Push-only  Hybrid  Frequency  Periodic  Conditional  Ad-hoc or irregular  Communication Methods  Unicast  One-to-many  Note: not all combinations make sense © 2020, M.T. Özsu & P. Valduriez 18
  • 19. Outline  Introduction    Distributed DBMS promises   © 2020, M.T. Özsu & P. Valduriez 19
  • 20. Distributed DBMS Promises  Transparent management of distributed, fragmented, and replicated data  Improved reliability/availability through distributed transactions  Improved performance  Easier and more economical system expansion © 2020, M.T. Özsu & P. Valduriez
  • 21. Transparency  Transparency is the separation of the higher-level semantics of a system from the lower level implementation issues.  Fundamental issue is to provide data independence in the distributed environment  Network (distribution) transparency  Replication transparency  Fragmentation transparency  horizontal fragmentation: selection  vertical fragmentation: projection  hybrid © 2020, M.T. Özsu & P. Valduriez
  • 22. Example © 2020, M.T. Özsu & P. Valduriez 22
  • 23. Transparent Access SELECT ENAME,SAL FROM EMP,ASG,PAY WHERE DUR > 12 AND EMP.ENO = ASG.ENO AND PAY.TITLE = EMP.TITLE Paris projects Paris employees Paris assignments Boston employees Montreal projects Paris projects New York projects with budget > 200000 Montreal employees Montreal assignments Boston Communication Network Montreal Paris New York Boston projects Boston employees Boston assignments Boston projects New York employees New York projects New York assignments Tokyo © 2020, M.T. Özsu & P. Valduriez 23
  • 24. Distributed Database - User View Distributed Database © 2020, M.T. Özsu & P. Valduriez 24
  • 25. Distributed DBMS - Reality Communication Subsystem DBMS Software User Application User Query DBMS Software DBMS Software DBMS Software User Query DBMS Software User Query User Application © 2020, M.T. Özsu & P. Valduriez 25
  • 26. Types of Transparency  Data independence  Network transparency (or distribution transparency)  Location transparency  Fragmentation transparency  Fragmentation transparency  Replication transparency © 2020, M.T. Özsu & P. Valduriez 26
  • 27. Reliability Through Transactions  Replicated components and data should make distributed DBMS more reliable.  Distributed transactions provide  Concurrency transparency  Failure atomicity • Distributed transaction support requires implementation of  Distributed concurrency control protocols  Commit protocols  Data replication  Great for read-intensive workloads, problematic for updates  Replication protocols © 2020, M.T. Özsu & P. Valduriez 27
  • 28. Potentially Improved Performance  Proximity of data to its points of use  Requires some support for fragmentation and replication  Parallelism in execution  Inter-query parallelism  Intra-query parallelism © 2020, M.T. Özsu & P. Valduriez 28
  • 29. Scalability  Issue is database scaling and workload scaling  Adding processing and storage power  Scale-out: add more servers  Scale-up: increase the capacity of one server → has limits © 2020, M.T. Özsu & P. Valduriez 29
  • 30. Outline  Introduction     Design issues  © 2020, M.T. Özsu & P. Valduriez 30
  • 31. Distributed DBMS Issues  Distributed database design  How to distribute the database  Replicated & non-replicated database distribution  A related problem in directory management  Distributed query processing  Convert user transactions to data manipulation instructions  Optimization problem  min{cost = data transmission + local processing}  General formulation is NP-hard © 2020, M.T. Özsu & P. Valduriez 31
  • 32. Distributed DBMS Issues  Distributed concurrency control  Synchronization of concurrent accesses  Consistency and isolation of transactions' effects  Deadlock management  Reliability  How to make the system resilient to failures  Atomicity and durability © 2020, M.T. Özsu & P. Valduriez 32
  • 33. Distributed DBMS Issues  Replication  Mutual consistency  Freshness of copies  Eager vs lazy  Centralized vs distributed  Parallel DBMS  Objectives: high scalability and performance  Not geo-distributed  Cluster computing © 2020, M.T. Özsu & P. Valduriez 33
  • 34. Related Issues  Alternative distribution approaches  Modern P2P  World Wide Web (WWW or Web)  Big data processing  4V: volume, variety, velocity, veracity  MapReduce & Spark  Stream data  Graph analytics  NoSQL  NewSQL  Polystores © 2020, M.T. Özsu & P. Valduriez 34
  • 35. Outline  Introduction      Distributed DBMS architecture © 2020, M.T. Özsu & P. Valduriez 35
  • 36. DBMS Implementation Alternatives © 2020, M.T. Özsu & P. Valduriez 36
  • 37. Dimensions of the Problem  Distribution  Whether the components of the system are located on the same machine or not  Heterogeneity  Various levels (hardware, communications, operating system)  DBMS important one  data model, query language,transaction management algorithms  Autonomy  Not well understood and most troublesome  Various versions  Design autonomy: Ability of a component DBMS to decide on issues related to its own design.  Communication autonomy: Ability of a component DBMS to decide whether and how to communicate with other DBMSs.  Execution autonomy: Ability of a component DBMS to execute local operations in any manner it wants to. © 2020, M.T. Özsu & P. Valduriez 37
  • 38. Client/Server Architecture © 2020, M.T. Özsu & P. Valduriez 38
  • 39. Advantages of Client-Server Architectures  More efficient division of labor  Horizontal and vertical scaling of resources  Better price/performance on client machines  Ability to use familiar tools on client machines  Client access to remote data (via standards)  Full DBMS functionality provided to client workstations  Overall better system price/performance © 2020, M.T. Özsu & P. Valduriez 39
  • 40. Database Server © 2020, M.T. Özsu & P. Valduriez 40
  • 41. Distributed Database Servers © 2020, M.T. Özsu & P. Valduriez 41
  • 42. Peer-to-Peer Component Architecture © 2020, M.T. Özsu & P. Valduriez 42
  • 43. MDBS Components & Execution © 2020, M.T. Özsu & P. Valduriez 43
  • 44. Mediator/Wrapper Architecture © 2020, M.T. Özsu & P. Valduriez 44
  • 45. Cloud Computing © 2020, M.T. Özsu & P. Valduriez 45 On-demand, reliable services provided over the Internet in a cost-efficient manner ‫التكلفة‬ ‫حيث‬ ‫من‬ ‫فعالة‬ ‫بطريقة‬ ‫اإلنترنت‬ ‫عبر‬ ‫مقدمة‬ ‫الطلب‬ ‫عند‬ ‫موثوقة‬ ‫خدمات‬  IaaS – Infrastructure-as-a-Service  PaaS – Platform-as-a-Service  SaaS – Software-as-a-Service  DaaS – Database-as-a-Service
  • 46. Simplified Cloud Architecture © 2020, M.T. Özsu & P. Valduriez 46