Distributed DBMS
© 1998 M. Tamer Özsu & Patrick Valduriez
Outline
1. Introduction
2. Distributed Database Design
3. Overview of Query Processing
4. Query Decomposition and Data Localization
5. Distributed Transaction Management
6. Distributed Concurrency Control
7. Parallel Database Systems
 Data placement and query processing
 Load balancing
 Database clusters
The Database Problem
 Large volume of data ⇒ use disk and large main memory
 I/O bottleneck (or memory access bottleneck)
 Speed(disk) << speed(RAM) << speed(microprocessor)
 Predictions
 Moore's law: processor speed growth (with multicore): 50% per year
 DRAM capacity growth: 4× every three years
 Disk throughput: 2× in the last ten years
The Solution
 Increase the I/O bandwidth
 Data partitioning
 Parallel data access
Multiprocessor Objectives
 High performance with better cost-performance than a mainframe or vector supercomputer
 Use many nodes, each with good cost-performance, communicating through a network
 The real challenge is to parallelize applications so that they run with good load balancing
Parallel DBMS
 Loose definition: a DBMS implemented on a tightly coupled multiprocessor
 Straightforward porting of relational DBMS (the software vendor edge)
 Naturally extends to distributed databases with one server per site
Parallel DBMS - Objectives
 High performance through parallelism
 High throughput with inter-query parallelism
 Low response time with intra-operation parallelism
 High availability and reliability by exploiting data replication
 Extensibility: handling increasing database sizes or increasing performance demands (e.g., throughput) should be easier
 Linear speed-up
 Linear scale-up
Linear Speed-up
Linear increase in performance for a constant DB size
and proportional increase of the system components
(processor, memory, disk)
[Figure: performance ratio (new perf. / old perf.) plotted against the number of components; the ideal curve grows linearly.]
Linear Scale-up
Sustained performance for a linear increase of database size
and proportional increase of the system components.
[Figure: performance ratio (new perf. / old perf.) plotted against components + database size; the ideal curve stays flat.]
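These two metrics are commonly formalized as follows (a standard formulation, added here for clarity; the slides themselves only give the graphical definitions):

\[
\mathrm{speedup}(n) = \frac{\text{elapsed time on the original system}}{\text{elapsed time on the system with } n \text{ times the components}},
\qquad \text{linear speed-up: } \mathrm{speedup}(n) = n
\]
\[
\mathrm{scaleup}(n) = \frac{\text{elapsed time for a size-}1\text{ problem on the original system}}{\text{elapsed time for a size-}n\text{ problem on } n \text{ times the components}},
\qquad \text{linear scale-up: } \mathrm{scaleup}(n) = 1
\]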
Parallel DBMS – Functional Architecture
[Figure: user tasks 1..n connect through the Session Mgr; each request is handled by a Request Mgr task (RM task 1..n), which in turn drives several Data Mgr tasks (DM task 11, 12, ..., n1, n2) that execute the query in parallel.]
Parallel DBMS Functions
 Session manager
 Host interface
 Transaction monitoring for OLTP
 Request manager
 Query compilation and execution
 To speed up query execution, it may optimize and parallelize the query at compile time
 Data manager (a rough sketch of the three-layer flow follows below)
 Provides all the low-level functions needed to run compiled queries in parallel, i.e., database operator execution, parallel transaction support, and cache management
 Transaction management support
 Data management
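As a rough, hypothetical sketch of how a request could flow through these three layers (all class and method names below are invented for illustration, not the API of any real parallel DBMS):

# Hypothetical three-layer flow: session manager -> request manager -> data managers.

class DataManager:
    """Runs compiled operators on its local data partition."""
    def __init__(self, partition):
        self.partition = partition

    def run_operator(self, predicate):
        # Low-level operator execution on the local partition only.
        return [row for row in self.partition if predicate(row)]

class RequestManager:
    """Compiles/parallelizes a query and coordinates its execution."""
    def __init__(self, data_managers):
        self.data_managers = data_managers

    def execute(self, predicate):
        # "Parallelization" here is simply one task per data manager;
        # a real request manager would build and optimize an operator tree.
        partial = [dm.run_operator(predicate) for dm in self.data_managers]
        return [row for part in partial for row in part]

class SessionManager:
    """Host interface: accepts client requests and forwards them."""
    def __init__(self, request_manager):
        self.request_manager = request_manager

    def submit(self, predicate):
        # A real session manager would also do transaction monitoring for OLTP.
        return self.request_manager.execute(predicate)

# Usage: two data managers, each owning one partition of a relation.
dms = [DataManager([1, 2, 3]), DataManager([4, 5, 6])]
session = SessionManager(RequestManager(dms))
print(session.submit(lambda x: x % 2 == 0))   # [2, 4, 6]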
Parallel System Architectures
 One guiding design decision is the way the main hardware
elements, i.e., processors, main memory, and disks, are
connected through some fast interconnection network.
 Multiprocessor architecture alternatives
 Shared memory (SM)
 Shared disk (SD)
 Shared nothing (SN)
 Hybrid architectures
Shared-Memory
DBMS on symmetric multiprocessors (SMP)
+ Simplicity, load balancing, fast communication
- Network cost, low extensibility, and low availability
All the processors are under the
control of a single operating
system.
Shared-Disk
+ Lower network cost, extensibility, availability, and lower overall cost
- Complexity, potential performance problems with cache coherency
Each processor-memory node is under the control of its own copy of the operating system.
Shared-Nothing
+ Extensibility and availability
- Difficult load balancing (based on data location and not on the actual load of the system)
Each node can be viewed as a local site (with its own database and software) in a distributed database system.
Hybrid Architectures
 Various combinations of the three basic architectures are possible, to obtain different trade-offs between cost, performance, extensibility, availability, etc.
Parallel DBMS Techniques
 Data placement
 Physical placement of the DB onto multiple nodes
 Static vs. Dynamic
 Parallel data processing
 Select is easy
 Join (and all other non-select operations) is more difficult
 Parallel query optimization
 Choice of the best parallel execution plans
 Automatic parallelization of the queries and load balancing
Data Partitioning
 Since programs are executed where the data reside, data placement is a critical performance issue.
 Data placement must be done so as to maximize system performance.
 Each relation is divided into n partitions (subrelations), where n is a function of relation size and access frequency.
 Implementation (each scheme is sketched in code below)
 Round-robin
 Maps the i-th element to node i mod n
 Good for sequential relation access but not for exact-match queries
 B-tree index (range partitioning)
 Supports range queries but large index
 Hash function
 Only exact-match queries but small index
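A minimal sketch of the three placement schemes as node-assignment functions (the node count, range boundaries, and use of Python's built-in hash are illustrative assumptions, not part of the slides):

# Sketch of round-robin, hash, and range (interval) partitioning; 4 nodes assumed.
import bisect

N_NODES = 4

def round_robin(i, n_nodes=N_NODES):
    # Place the i-th tuple on node i mod n: good for sequential scans,
    # but gives no help in locating a tuple for an exact-match query.
    return i % n_nodes

def hash_partition(key, n_nodes=N_NODES):
    # Place a tuple by hashing its partitioning attribute: an exact-match
    # query goes to a single node, but a range query must hit all nodes.
    return hash(key) % n_nodes

RANGE_BOUNDS = ["h", "n", "u"]   # node 0: a-g, node 1: h-m, node 2: n-t, node 3: u-z

def range_partition(key, bounds=RANGE_BOUNDS):
    # Place a tuple by the interval its key falls into: supports range
    # queries, at the price of maintaining the (possibly large) index.
    return bisect.bisect_right(bounds, key)

print(round_robin(10))            # 2
print(hash_partition("smith"))    # some node in 0..3 (depends on the hash seed)
print(range_partition("paris"))   # 2 (falls in the n-t interval)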
Partitioning Schemes
[Figure: the three schemes side by side: Round-Robin, Hashing, and Interval (range) partitioning, the latter splitting the key domain into ranges such as a-g, h-m, ..., u-z.]
Parallel Data Processing
 Two ways of exploiting high-performance multiprocessor systems:
 Automatically detect parallelism in sequential programs (e.g., Fortran, OPS5)
 Augment an existing language with parallel constructs (e.g., C*, Fortran 90)
 Critique
 Parallelizing compilers are hard to develop and yield limited speed-up
 Parallel constructs let the programmer express parallel computations, but expose too many low-level details
Data-based Parallelism
 Inter-operation
 p operations of the same query executed in parallel
 Intra-operation
 The same operation executed in parallel on different data partitions
 Based on the decomposition of one operator (e.g., SELECT) into a set of independent sub-operators, called operator instances (see the sketch below)
[Figure: inter-operation parallelism shown as operators op.1, op.2, op.3 of one query running concurrently; intra-operation parallelism shown as one operator replicated over partitions R1–R4 of relation R.]
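A minimal sketch of intra-operation parallelism for a selection, assuming the relation has already been partitioned (the data, predicate, and use of a process pool are illustrative):

# One SELECT decomposed into independent operator instances, one per partition.
from concurrent.futures import ProcessPoolExecutor

def select_instance(partition):
    # One operator instance: apply the selection predicate to its own partition.
    return [t for t in partition if t["city"] == "Paris"]

if __name__ == "__main__":
    partitions = [
        [{"name": "A", "city": "Paris"}, {"name": "B", "city": "Waterloo"}],
        [{"name": "C", "city": "Paris"}],
        [{"name": "D", "city": "Lyon"}],
    ]
    with ProcessPoolExecutor() as pool:
        # The instances run in parallel; the union of their results is the answer.
        results = [t for part in pool.map(select_instance, partitions) for t in part]
    print(results)   # the customers located in Paris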
Join Processing
 Three basic algorithms for intra-operator parallelism
 Parallel nested loop join: no special assumption
 Parallel associative join: one relation is declustered on the join attribute; equi-join only
 Parallel hash join: equi-join only
 They also apply to other complex operators such as duplicate elimination, union, intersection, etc., with minor adaptation
1. Parallel Nested Loop Join
[Figure: R1 (on node 1) and R2 (on node 2) are each sent to nodes 3 and 4, which hold S1 and S2; every node holding an S fragment thus receives the complete R and joins it locally with its fragment.]
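A simplified single-process simulation of the idea (node assignments and data are made up; a real system ships the fragments over the interconnect):

# Parallel nested loop join: R is broadcast to every node holding a fragment of S.
R_fragments = {1: [(1, "a"), (2, "b")], 2: [(3, "c")]}    # R1 on node 1, R2 on node 2
S_fragments = {3: [(1, "x"), (3, "y")], 4: [(2, "z")]}    # S1 on node 3, S2 on node 4

# Phase 1: every R fragment is sent to every node that holds a fragment of S.
R_at_S_nodes = {node: [r for frag in R_fragments.values() for r in frag]
                for node in S_fragments}

# Phase 2: each S node runs a local nested loop join of (all of R) with its S fragment.
result = []
for node, s_frag in S_fragments.items():
    for r in R_at_S_nodes[node]:
        for s in s_frag:
            if r[0] == s[0]:          # an equi-join here, but any join predicate works
                result.append((r, s))

print(result)   # the join of R and S, assembled from the per-node partial results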
2. Parallel Associative Join
[Figure: R1 (node 1) and R2 (node 2) are repartitioned on the join attribute, and each piece is sent only to the node (3 or 4) holding the matching fragment S1 or S2, which then joins locally.]
3. Parallel Hash Join
[Figure: R1, R2, S1, and S2 (held on nodes 1 and 2) are all repartitioned by hashing the join attribute and sent to a set of join nodes; each join node joins the corresponding R and S buckets locally.]
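A simplified single-process simulation of the two phases, repartition by hash and then join locally (the data, node count, and use of Python's built-in hash are illustrative):

# Parallel hash join (equi-join only): both relations are repartitioned by hash.
from collections import defaultdict

R = [(1, "a"), (2, "b"), (3, "c")]
S = [(1, "x"), (2, "y"), (2, "z")]
N_NODES = 2

# Phase 1: hash the join attribute so matching R and S tuples land on the same node.
r_buckets, s_buckets = defaultdict(list), defaultdict(list)
for r in R:
    r_buckets[hash(r[0]) % N_NODES].append(r)
for s in S:
    s_buckets[hash(s[0]) % N_NODES].append(s)

# Phase 2: each node joins its own pair of buckets locally, here with a hash table.
result = []
for node in range(N_NODES):
    local_index = defaultdict(list)
    for r in r_buckets[node]:
        local_index[r[0]].append(r)
    for s in s_buckets[node]:
        for r in local_index[s[0]]:
            result.append((r, s))

print(result)   # the equi-join of R and S, assembled from the per-node partial results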
Parallel Query Optimization
 The objective is to select the “best” parallel
execution plan for a query using the following
components
 Search space
 Models alternative execution plans as operator trees
 Left-deep vs. Right-deep vs. Bushy trees
 Search strategy
 Dynamic programming for small search space
 Randomized for large search space
 Cost model (abstraction of execution system)
 Is responsible for estimating the cost of a given
execution plan
 Physical schema info. (partitioning, indexes, etc.)
 Statistics and cost functions
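As a rough illustration of the search-strategy and cost-model bullets above, here is a toy dynamic program over left-deep join orders; the cardinalities, cost function, and selectivity estimate are invented, and a real optimizer would also account for partitioning, indexes, and parallel resources:

# Toy dynamic programming over left-deep join orders with an invented cost model.
from itertools import combinations

card = {"R1": 1000, "R2": 100, "R3": 10, "R4": 10000}   # made-up base cardinalities

def join_cost(left_card, right_card):
    # Invented cost function: proportional to the product of the input sizes.
    return left_card * right_card

def best_left_deep_plan(relations):
    # best[set of relations] = (cost, plan text, estimated result cardinality)
    best = {frozenset([r]): (0, r, card[r]) for r in relations}
    for size in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, size)):
            candidates = []
            for r in subset:                        # r is joined last, as the right input
                cost_so_far, plan, inter_card = best[subset - {r}]
                cost = cost_so_far + join_cost(inter_card, card[r])
                est = max(inter_card * card[r] // 100, 1)   # crude 1% join selectivity
                candidates.append((cost, "(" + plan + " JOIN " + r + ")", est))
            best[subset] = min(candidates)          # keep the cheapest plan per subset
    return best[frozenset(relations)]

print(best_left_deep_plan(["R1", "R2", "R3", "R4"]))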
Execution Plans as Operator Trees
[Figure: three example operator trees joining R1–R4 into Result: a left-deep tree (joins j1–j3), a right-deep tree (j4–j6), and a zig-zag tree (j7–j9).]
Load Balancing
 Good load balancing is crucial for the performance of a
parallel system
 The response time of a set of parallel operators is that
of the longest one.
 Startup
 The time needed to start a parallel operation may
dominate the actual computation time
 Interference
 When accessing shared resources, each new process
slows down the others (hot spot problem)
 Skew
 The response time of a set of parallel processes is
the time of the slowest one
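The skew point can be made concrete with a tiny calculation (the partition sizes and per-tuple cost below are hypothetical):

# Response time of a parallel operation = the time of its slowest instance.
partition_sizes = [1000, 1100, 900, 8000]     # hypothetical: the last partition is skewed
per_tuple_cost = 0.001                        # arbitrary time units per tuple

instance_times = [n * per_tuple_cost for n in partition_sizes]
response_time = max(instance_times)                        # the skewed instance dominates
balanced_time = sum(instance_times) / len(instance_times)  # what perfect balancing would give

print(f"response time = {response_time:.1f}, balanced = {balanced_time:.1f}")
# response time = 8.0, balanced = 2.8 -- the single overloaded instance sets the pace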
Load Balancing
 Problems arise for intra-operator parallelism
with skewed data distributions
 attribute value skew (AVS)
 e.g., there are more citizens in Paris than in Waterloo
 tuple placement skew (TPS)
 introduced when the data are initially partitioned (e.g.,
with range partitioning).
 selectivity skew (SS)
 redistribution skew (RS)
 join product skew (JPS)
End of Chapter 6