Ch07.ppt1. Distributed DBMS Page 10-12. 1
© 1998 M. Tamer Özsu & Patrick Valduriez
Outline
1. Introduction
2. Distributed Database Design
3. Overview of Query Processing
4. Query Decomposition and Data Localization
5. Distributed Transaction Management
6. Distributed Concurrency Control
7. Parallel Database Systems
Data placement and query processing
Load balancing
Database clusters
2. Distributed DBMS Page 10-12. 2
© 1998 M. Tamer Özsu & Patrick Valduriez
The Database Problem
Large volume of data use disk and large main
memory
I/O bottleneck (or memory access bottleneck)
Speed(disk) << speed(RAM) << speed(microprocessor)
Predictions
Moore’s law: processor speed growth (with multicore):
50 % per year
DRAM capacity growth : 4 every three years
Disk throughput : 2 in the last ten years
3. Distributed DBMS Page 10-12. 3
© 1998 M. Tamer Özsu & Patrick Valduriez
The Solution
Increase the I/O bandwidth
Data partitioning
Parallel data access
4. Distributed DBMS Page 10-12. 4
© 1998 M. Tamer Özsu & Patrick Valduriez
Multiprocessor Objectives
High-performance with better cost-performance
than mainframe or vector supercomputer
Use many nodes, each with good cost-performance,
communicating through network
The real challenge is to parallelize applications to
run with good load balancing
5. Distributed DBMS Page 10-12. 8
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel DBMS
Loose definition: a DBMS implemented on a
tighly coupled multiprocessor
Straighforward porting of relational DBMS
(the software vendor edge)
Naturally extends to distributed databases
with one server per site
6. Distributed DBMS Page 10-12. 9
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel DBMS - Objectives
High-performance through parallelism
High throughput with inter-query parallelism
Low response time with intra-operation parallelism
High availability and reliability by exploiting
data replication
Extensibility increasing database sizes or
increasing performance demands (e.g.,
throughput) should be easier.
Linear speed-up
Linear scale-up
7. Distributed DBMS Page 10-12. 10
© 1998 M. Tamer Özsu & Patrick Valduriez
Linear Speed-up
Linear increase in performance for a constant DB size
and proportional increase of the system components
(processor, memory, disk)
new perf.
old perf.
ideal
components
8. Distributed DBMS Page 10-12. 11
© 1998 M. Tamer Özsu & Patrick Valduriez
Linear Scale-up
Sustained performance for a linear increase of database size
and proportional increase of the system components.
components + database size
new perf.
old perf.
ideal
9. Distributed DBMS Page 10-12. 12
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel DBMS – Functional
Architecture
RM
task n
DM
task 12
DM
task n2
DM
task n1
Data Mgr
DM
task 11
Request Mgr
RM
task 1
Session Mgr
User
task 1
User
task n
10. Distributed DBMS Page 10-12. 13
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel DBMS Functions
Session manager
Host interface
Transaction monitoring for OLTP
Request manager
Query Compilation, and execution
To speed up query execution, it may optimize and
parallelize the query at compile-time.
Data manager
It provides all the low-level functions needed to run
compiled queries in parallel, i.e., database operator
execution, parallel transaction support, cache
management,
Transaction management support
Data management
11. Distributed DBMS Page 10-12. 14
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel System Architectures
One guiding design decision is the way the main hardware
elements, i.e., processors, main memory, and disks, are
connected through some fast interconnection network.
Multiprocessor architecture alternatives
Shared memory (SM)
Shared disk (SD)
Shared nothing (SN)
Hybrid architectures
12. Distributed DBMS Page 10-12. 15
© 1998 M. Tamer Özsu & Patrick Valduriez
Shared-Memory
DBMS on symmetric multiprocessors (SMP)
+ Simplicity, load balancing, fast communication
- Network cost, low extensibility, and low availability
All the processors are under the
control of a single operating
system.
13. Distributed DBMS Page 10-12. 16
© 1998 M. Tamer Özsu & Patrick Valduriez
Shared-Disk
+ network cost, extensibility, availability, and lower cost
- complexity, potential performance problem for cache
coherency
Each processor-memory node is
under the control of its own copy
of the operating system
14. Distributed DBMS Page 10-12. 17
© 1998 M. Tamer Özsu & Patrick Valduriez
Shared-Nothing
+ Extensibility, and availability
- Difficult load balancing (based on data location and not
the actual load of the system )
Each node can be viewed as a
local site (with its own database
and software) in a distributed
database system.
15. Distributed DBMS Page 10-12. 18
© 1998 M. Tamer Özsu & Patrick Valduriez
Hybrid Architectures
Various possible combinations of the three basic
architectures are possible to obtain different trade-
offs between cost, performance, extensibility,
availability, etc.
16. Distributed DBMS Page 10-12. 19
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel DBMS Techniques
Data placement
Physical placement of the DB onto multiple nodes
Static vs. Dynamic
Parallel data processing
Select is easy
Join (and all other non-select operations) is more difficult
Parallel query optimization
Choice of the best parallel execution plans
Automatic parallelization of the queries and load balancing
17. Distributed DBMS Page 10-12. 20
© 1998 M. Tamer Özsu & Patrick Valduriez
Data Partitioning
Since programs are executed where the data reside, data
placement is a critical performance issue.
Data placement must be done so as to maximize system
performance.
Each relation is divided in n partitions (subrelations),
where n is a function of relation size and access frequency
Implementation
Round-robin
Maps i-th element to node i mod n
Good for sequential relation access but not for exact-match queries
B-tree index (Range partitioning)
Supports range queries but large index
Hash function
Only exact-match queries but small index
18. Distributed DBMS Page 10-12. 21
© 1998 M. Tamer Özsu & Patrick Valduriez
Partitioning Schemes
Round-Robin Hashing
Interval
••• •••
•••
•••
•••
•••
•••
a-g h-m u-z
19. Distributed DBMS Page 10-12. 22
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel Data Processing
Two ways of exploiting high-performance multiprocessor
systems:
Automatically detect parallelism in sequential programs (e.g.,
Fortran, OPS5)
Augment an existing language with parallel constructs (e.g., C*,
Fortran90)
Critique
Hard to develop parallelizing compilers, limited resulting speed-
up
Enables the programmer to express parallel computations but too
low-level details
20. Distributed DBMS Page 10-12. 23
© 1998 M. Tamer Özsu & Patrick Valduriez
Data-based Parallelism
Inter-operation
p operations of the same query in parallel
op.3
op.1 op.2
op.
R
op.
R1
op.
R2
op.
R3
op.
R4
•Intra-operation
➡ The same op in parallel
➡ is based on the decomposition of one operator (e.g: SELECT) in a
set of independent sub-operators, called operator instances.
21. Distributed DBMS Page 10-12. 24
© 1998 M. Tamer Özsu & Patrick Valduriez
Join Processing
Three basic algorithms for intra-operator
parallelism
Parallel nested loop join: no special assumption
Parallel associative join: one relation is declustered on join
attribute and equi-join
Parallel hash join: equi-join
They also apply to other complex operators such as
duplicate elimination, union, intersection, etc. with
minor adaptation
22. Distributed DBMS Page 10-12. 25
© 1998 M. Tamer Özsu & Patrick Valduriez
1. Parallel Nested Loop Join
send
partition
node 3 node 4
node 1 node 2
R1:
S1 S2
R2:
23. Distributed DBMS Page 10-12. 26
© 1998 M. Tamer Özsu & Patrick Valduriez
2. Parallel Associative Join
node 1
node 3 node 4
node 2
R1: R2:
S1
S2
24. Distributed DBMS Page 10-12. 27
© 1998 M. Tamer Özsu & Patrick Valduriez
3. Parallel Hash Join
node node node node
node 1 node 2
R1: R2: S1: S2:
25. Distributed DBMS Page 10-12. 28
© 1998 M. Tamer Özsu & Patrick Valduriez
Parallel Query Optimization
The objective is to select the “best” parallel
execution plan for a query using the following
components
Search space
Models alternative execution plans as operator trees
Left-deep vs. Right-deep vs. Bushy trees
Search strategy
Dynamic programming for small search space
Randomized for large search space
Cost model (abstraction of execution system)
Is responsible for estimating the cost of a given
execution plan
Physical schema info. (partitioning, indexes, etc.)
Statistics and cost functions
26. Distributed DBMS Page 10-12. 29
© 1998 M. Tamer Özsu & Patrick Valduriez
Execution Plans as Operator Trees
R2
R1
R4
Result
j2
j3
Left-deep
Right-deep
j1 R3
R2
R1
R4
Result
j5
j6
j4
R3
R2
R1
R3
j7
R4
Result
j9
Zig-zag
j8
27. Distributed DBMS Page 10-12. 30
© 1998 M. Tamer Özsu & Patrick Valduriez
Load Balancing
Good load balancing is crucial for the performance of a
parallel system
The response time of a set of parallel operators is that
of the longest one.
Startup
The time needed to start a parallel operation may
dominate the actual computation time
Interference
When accessing shared resources, each new process
slows down the others (hot spot problem)
Skew
The response time of a set of parallel processes is
the time of the slowest one
28. Distributed DBMS Page 10-12. 31
© 1998 M. Tamer Özsu & Patrick Valduriez
Load Balancing
Problems arise for intra-operator parallelism
with skewed data distributions
attribute data skew (AVS)
e.g., there are more citizens in Paris than in Waterloo
tuple placement skew (TPS)
introduced when the data are initially partitioned (e.g.,
with range partitioning).
selectivity skew (SS)
redistribution skew (RS)
join product skew (JPS)