Distributed_Database_System — Presentation Transcript

    • Distributed Query Processing
      Distributed Database Systems
      Katja Hose, Ralf Schenkel
      Max-Planck-Institut für Informatik, Cluster of Excellence MMCI
      November 10 / November 17, 2011

    • Contents
      1 Motivation
      2 Detour on centralized query processing
        • Translating SQL into relational algebra
        • Phases of centralized query processing
        • Query parsing
        • Query transformation
        • Query optimization
      3 Basics of distributed query processing
        • Phases of distributed query processing
        • Introduction
        • Meta data management
        • Data localization
      4 Global query optimization
        • Main questions
        • Global query optimizer
        • Distributed cost model
        • Join order optimization
        • Total time models
        • Response time models
      5 Summary

    • Motivation
      The task of query processing is to answer user queries.
      Example: How many students are at Saarland University? Answer: 18.000
      Additional constraints:
        • Low response times
        • High query throughput
        • Efficient hardware usage
        • ...
    • Motivation – differences to centralized query processing
      • Considering the physical data distribution during query optimization
      • Considering communication costs
      Assumptions:
      • Data is distributed among multiple nodes
      • Existence of a global conceptual schema, which is used by all nodes
      • Queries are formulated on the global schema

    • Translating SQL into relational algebra
      SQL query structure: SELECT DISTINCT a1, ..., an FROM R1, ..., Rk WHERE p
      Algorithm:
      1 Translating the FROM clause
        Let R1, ..., Rk be the relations in the FROM clause of the query.
        Construct the expression
          R = R1                              if k = 1
          R = ((...(R1 × R2) × ...) × Rk)     otherwise
    • Translating SQL into relational algebra (cont.)
      2 Translating the WHERE clause
        Let F be the predicate in the WHERE clause of the query (if a WHERE clause exists).
        Construct the expression
          W = R         if there is no WHERE clause
          W = σF(R)     otherwise
      3 Translating the SELECT clause
        Let a1, ..., an (or "*") be the projection in the SELECT clause of the query.
        Construct the expression
          S = W                  if the projection is "*"
          S = πa1,...,an(W)      otherwise
        Output: S

    • Translating SQL into relational algebra – example
      Example query:
        SELECT DISTINCT e.EName, s.Salary
        FROM Employees e, Salary s
        WHERE e.Title = s.Title AND s.Salary ≥ 60.000
      Applying the algorithm:
        R = Employees × Salary
        W = σe.Title=s.Title ∧ s.Salary≥60.000(R)
        S = πe.EName,s.Salary(W)

    • Workflow for centralized query processing
      [Figure: workflow for centralized query processing]
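    The three translation steps are mechanical enough to script. Below is a minimal, hypothetical sketch (the Query container and the string-based algebra output are my own illustration, not from the slides):

        from dataclasses import dataclass
        from typing import List, Optional

        @dataclass
        class Query:
            select: List[str]            # projection list, or ["*"]
            from_: List[str]             # relations R1, ..., Rk of the FROM clause
            where: Optional[str] = None  # predicate F of the WHERE clause, if any

        def translate(q: Query) -> str:
            # Step 1: FROM clause -> left-deep Cartesian product
            r = q.from_[0]
            for rel in q.from_[1:]:
                r = f"({r} × {rel})"
            # Step 2: WHERE clause -> selection, only if a predicate exists
            w = f"σ[{q.where}]({r})" if q.where else r
            # Step 3: SELECT clause -> projection, unless it is "*"
            return w if q.select == ["*"] else f"π[{', '.join(q.select)}]({w})"

        print(translate(Query(["e.EName", "s.Salary"],
                              ["Employees", "Salary"],
                              "e.Title = s.Title ∧ s.Salary ≥ 60.000")))
        # π[e.EName, s.Salary](σ[e.Title = s.Title ∧ s.Salary ≥ 60.000]((Employees × Salary)))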
    • Query parsing
      Transform a declarative query into an internal representation:
      • The query is formulated using a declarative query language, e.g., SQL
      • The parser translates the query into an internal representation, called a naive query plan
      • The plan is described by an operator tree of relational algebra operators

    • Query parsing – example
      Database managing information about employees and projects:
        Employees(EID, EName, Title)
        Assignment(ENo, PNo, Duration)
      Query: return the names of all employees working for project 'P1'
        SELECT EName
        FROM Employees e, Assignment a
        WHERE e.EID = ENo AND PNo='P1'
      Translation into relational algebra:
        πEName(σPNo='P1' ∧ Employees.EID=Assignment.ENo(Employees × Assignment))
      In contrast to the SQL statement, the algebra expression already contains the required basic evaluation operators.
      [Figure: operator tree of the example query]
    • Workflow for centralized query processing
      [Figure: workflow for centralized query processing]

    • Query transformation
      Steps:
      1 Name resolution: transforming object names into internal names
      2 Semantic analysis: checking for global relations and attributes, view expansion, global access control
      3 Normalization: transforming predicates into a canonical format
      4 Simple algebraic rewriting: application of heuristics to eliminate bad plans

    • Semantic analysis
      • Check whether the global schema defines all attributes and relations referenced in the query
      • If the query is formulated on a view, replace references to relations/attributes with references to global relations/attributes
      • Perform simple integrity checks, e.g., do the attributes used in comparison predicates have compatible types?
      • Initial check whether the issuer of the query has the rights to access the referenced relations/attributes

    • Normalization
      Objective: simplify the subsequent optimization by transforming the query into a canonical format
      • Applies to selection and join predicates
      • Conjunctive normal form vs. disjunctive normal form
        Conjunctive normal form: (p11 ∨ p12 ∨ ... ∨ p1n) ∧ ... ∧ (pm1 ∨ pm2 ∨ ... ∨ pmn)
        Disjunctive normal form: (p11 ∧ p12 ∧ ... ∧ p1n) ∨ ... ∨ (pm1 ∧ pm2 ∧ ... ∧ pmn)
      • Transformation based on equivalence rules for logical operators
    • Normalization – equivalence rules
      • p1 ∧ p2 ⇐⇒ p2 ∧ p1   and   p1 ∨ p2 ⇐⇒ p2 ∨ p1
      • p1 ∧ (p2 ∧ p3) ⇐⇒ (p1 ∧ p2) ∧ p3   and   p1 ∨ (p2 ∨ p3) ⇐⇒ (p1 ∨ p2) ∨ p3
      • p1 ∧ (p2 ∨ p3) ⇐⇒ (p1 ∧ p2) ∨ (p1 ∧ p3)   and   p1 ∨ (p2 ∧ p3) ⇐⇒ (p1 ∨ p2) ∧ (p1 ∨ p3)
      • ¬(p1 ∧ p2) ⇐⇒ ¬p1 ∨ ¬p2   and   ¬(p1 ∨ p2) ⇐⇒ ¬p1 ∧ ¬p2
      • ¬(¬p1) ⇐⇒ p1

    • Normalization – example
      SELECT EName
      FROM Employees e, Assignment a
      WHERE e.EID = a.ENo AND Duration ≥ 3 AND (PNo='P1' OR PNo='P2')
      Selection condition in disjunctive normal form:
        (EID = ENo ∧ Duration ≥ 3 ∧ PNo='P1') ∨ (EID = ENo ∧ Duration ≥ 3 ∧ PNo='P2')
      Selection condition in conjunctive normal form:
        EID = ENo ∧ Duration ≥ 3 ∧ (PNo='P1' ∨ PNo='P2')

    • Simple algebraic rewriting
      Simple optimizations that are always beneficial regardless of system state:
      • Elimination of redundant predicates
      • Simplification of expressions
      • Unnesting of subqueries and views
      Tasks:
      • Recognize and simplify all expressions/operations/subqueries that are "obviously" unnecessary, redundant, or contradictory
      • Do not consider system state information, e.g., size of tables, existence of indexes, etc.

    • Workflow for centralized query processing
      [Figure: workflow for centralized query processing]
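    The normalization rules above are exactly what sympy's logic module implements, so the example predicate can be checked mechanically (the symbol names j, d, p1, p2 are stand-ins for the atomic predicates):

        from sympy import symbols
        from sympy.logic.boolalg import to_cnf, to_dnf

        # Stand-ins for the atomic predicates of the example query:
        # j: EID = ENo, d: Duration >= 3, p1: PNo='P1', p2: PNo='P2'
        j, d, p1, p2 = symbols("j d p1 p2")

        predicate = j & d & (p1 | p2)
        print(to_dnf(predicate))  # (d & j & p1) | (d & j & p2)
        print(to_cnf(predicate))  # d & j & (p1 | p2)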
    • Query optimization
      Steps:
      1 Algebraic optimization: find a good relational algebra operator tree
        • Heuristic query optimization
        • Cost-based query optimization
        • Statistical query optimization
      2 Physical optimization: find suitable algorithms for implementing the operations

    • Heuristics
      • Use simple heuristics that usually lead to better performance
      • The optimal plan is not required, but the really bad ones should be avoided
      Heuristics:
      • Break selections: complex selection criteria should be broken into multiple parts
      • Push projection and push selection: cheap selections and projections should be performed as early as possible to reduce the sizes of intermediate results
      • Force joins: in most cases, using a join is much cheaper than using a Cartesian product followed by a selection

    • Algebraic optimization rules
      • The join operator is commutative: r1 ⋈ r2 ⇐⇒ r2 ⋈ r1
      • The join operator is associative: (r1 ⋈ r2) ⋈ r3 ⇐⇒ r1 ⋈ (r2 ⋈ r3)
      • Selections σ can be combined using logical and (∧); the order of the selections is arbitrary (exploiting commutativity of ∧):
          σF1(σF2(r1)) ⇐⇒ σF1∧F2(r1) ⇐⇒ σF2(σF1(r1))
      • For a projection π nested inside another projection π, the "outer" parameter dominates the "inner" one:
          πX(πY(r1)) ⇐⇒ πX(r1)   if X ⊆ Y
    • Algebraic optimization rules (cont.)
      • π and σ commute if predicate F is defined on the projection attributes:
          σF(πX(r1)) ⇐⇒ πX(σF(r1))   if attr(F) ⊆ X
      • Alternatively, the order can be changed if the inner projection is extended by all attributes X2 that F references:
          πX1(σF(r1)) ⇐⇒ πX1(σF(πX1,X2(r1)))   with X2 ⊆ attr(F)
      • σ and ⋈ commute if all selection attributes are contained in the same relation:
          σF(r1 ⋈ r2) ⇐⇒ σF(r1) ⋈ r2   if attr(F) ⊆ R1
      • A selection predicate F = F1 ∧ F2 can be split up in conjunction with a join if the attributes referred to by F1 and F2 are contained in different relations:
          σF(r1 ⋈ r2) ⇐⇒ σF1(r1) ⋈ σF2(r2)   if attr(F1) ⊆ R1 and attr(F2) ⊆ R2
      • In any case, part of a selection can be split off by separating the predicates F1 referencing attributes of R1 only; F2 contains the remaining predicates referencing attributes of both relations:
          σF(r1 ⋈ r2) ⇐⇒ σF2(σF1(r1) ⋈ r2)   if attr(F1) ⊆ R1

    • Algebraic optimization rules (cont.)
      • Commutativity of σ and ∪:
          σF(r1 ∪ r2) ⇐⇒ σF(r1) ∪ σF(r2)
      • Commutativity of σ and −:
          σF(r1 − r2) ⇐⇒ σF(r1) − σF(r2)
        or, in case F only references tuples in r1:
          σF(r1 − r2) ⇐⇒ σF(r1) − r2
      • Commutativity of π and ⋈:
          πX(r1 ⋈ r2) ⇐⇒ πX(πY1(r1) ⋈ πY2(r2))
          with Y1 = (X ∩ R1) ∪ (R1 ∩ R2) and Y2 = (X ∩ R2) ∪ (R1 ∩ R2)
        Pushing a projection is possible if all Yi are defined in such a way that they preserve all attributes necessary to perform the join.
    • Algebraic optimization rules – further rules
      • Commutativity of π and ∪: πX(r1 ∪ r2) ⇐⇒ πX(r1) ∪ πX(r2)
      • Distributive law for ⋈ and ∪, distributive law for ⋈ and −
      • Commutativity of renaming β with other operators, ...
      • Idempotence, e.g., A ∨ A ⇐⇒ A
      • Operations involving empty relations
      • Commutative and associative laws for ⋈, ∪, and ∩

    • Heuristic algebraic optimization – example
      Use the algebraic optimization heuristics:
      • Force join
      • Push selection and projection
      [Figure: example operator trees before and after heuristic optimization]

    • Cost-based algebraic query optimization
      • Most non-distributed RDBMS strongly rely on cost-based optimization
      • Aim for a better optimized plan with respect to system and data characteristics
      • Join order optimization
      Basic approach:
      • Establish a cost model for the various operations
      • Enumerate all query plans and compute their costs
      • Pick the best query plan
      Usually, dynamic programming techniques are used to keep the computational effort manageable.

    • Physical query optimization
      Input: an optimized query plan consisting of algebra operators
      Physical optimization chooses an algorithm to compute each algebra operator:
      • Join: block-nested-loop join, hash join, merge join, ...
      • Select: full table scan, index lookup, ad-hoc index generation & lookup, ...
      Tasks:
      • Translating a query plan into an execution plan
      • Physical and algebraic optimization are often interleaved
    • Query optimization example
      Output: query execution plan
      [Figure: example query execution plan]

    • Workflow for distributed query processing
      [Figure: workflow for distributed query processing]
    • Basic considerations
      Distributed query processing:
      • Shares the same properties as centralized query processing
      • Similar problem, but with different objectives and constraints
      Objectives for centralized query processing:
      • Minimize the number of disk accesses
      • Minimize computational time
      Objectives for distributed query processing:
      • Minimize resource consumption
      • Minimize response time
      • Maximize throughput

    • Basic considerations (cont.)
      Costs are more difficult to predict:
      • Join selectivity: is it worthwhile to push down a selection?
      • Data is distributed: difficult to get meaningful statistics
      • Network latency is very hard to predict
      • Current workload at nodes, load shedding
      Additional cost factors and constraints:
      • Extension of relational algebra (sending/receiving data)
      • Data localization (which node holds relevant data)
      • Replication and caching (where to compute an operation)
      • Network models
      • Response-time models
      • Data and structural heterogeneity (federated databases ...)

    • Consequences
      • Optimization is much more difficult than in the centralized case
      • Statistics and costs change over time, e.g., workload at a node, network load
      • More conflicting optimization goals: increasing throughput calls for reducing replication and parallelization, while decreasing query response time calls for increasing parallelization
      • More cost factors and constraints
      Consequences:
      • Adaptive query plans (create an initial plan and optimize it on-the-fly)
      • Do not aim for the best plan, but for a good plan

    • Example
      Query: return the names of all employees working for project 'P1'
        πEName(πEID,EName(Employees) ⋈Employees.EID=Assignment.ENo πENo(σPNo='P1'(Assignment)))
      Problems:
      • The relations are fragmented and distributed among five nodes
      • The Employees relation uses primary horizontal fragmentation: one fragment located at node 1, the other at node 2, no replication
      • The Assignment relation uses derived horizontal fragmentation: one fragment located at node 3, the other at node 4, no replication
      • The query originates from node 5
    • Example – cost model and statistics
      • Accessing a tuple costs 1 unit (acc)
      • Transferring a tuple costs 10 units (trans)
      • There are 400 employees and 1000 assignments
      • 20 assignments for project 'P1'
      • All tuples are uniformly distributed, i.e., nodes 3 and 4 provide 10 assignments for project 'P1' each
      • There are local indexes on attribute PNo at nodes 3 and 4 (as well as indexes on primary keys at all nodes)
      • Direct tuple access is possible on local sites, no scanning
      • All nodes can directly communicate with each other
      • Simplification: no costs for unions and projections

    • Example – simple execution plans
      Version A: transfer all data to node 5
      [Figure: execution plan A]
      Version B: ship intermediate results
      [Figure: execution plan B]
    • Example – costs
      Costs of plan A: 23.000 units
      Costs of plan B: 440 units
      (One plausible accounting for plan B is sketched below.)

    • Important aspects of distributed query processing
      • Meta data management
      • Data localization
      • Global query optimization
      • Post-processing
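    A back-of-the-envelope check of the plan B figure under one plausible accounting of the stated cost model (1 unit per tuple access, 10 units per tuple transfer); the step breakdown is my reading of the example, not spelled out in the transcript:

        acc, trans = 1, 10  # cost units from the example's cost model

        # Plan B: reduce at the data and ship only intermediate results.
        p1_tuples = 20  # 'P1' assignments: 10 at node 3 + 10 at node 4 (via PNo indexes)

        cost_b = (p1_tuples * acc      # index accesses to the 'P1' assignments
                  + p1_tuples * trans  # ship the 20 tuples to the employee nodes 1 and 2
                  + p1_tuples * acc    # look up the matching employees via the EID indexes
                  + p1_tuples * trans) # ship the 20 result tuples to node 5
        print(cost_b)  # 440, matching the figure on the slides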
    • Meta data management
      Prerequisites to perform query optimization:
      • Meta data must be available
      • Meta data is stored in the catalog
      • The catalog provides information about the data distribution
      • This information is used to decide, for instance, whether it is worthwhile to execute a selection very early
      [Figure: workflow for distributed query processing]

    • Meta data management – catalog contents
      Typical contents of a catalog for distributed database management systems:
      • Database schema: definitions of tables, views, constraints, keys, ...
      • Partitioning schema: information about how the schema is partitioned and how tables can be reconstructed
      • Allocation schema: information about which fragment can be found at which node (including information about replication)
      • Network information: information about node connections, network model
      • Additional physical information: information about indexes, data statistics (histograms, etc.), hardware resources (processing & storage), ...

    • Meta data management – catalog placement
      Where to store the catalog in a distributed system?
      • Central node: simple solution, but a bottleneck
      • Replicated at all nodes: updates are expensive
      • Fragmented: in rare cases, the catalog may become very large; the catalog itself then has to be fragmented and allocated
      • Caching: replicate only needed parts of a central catalog, anticipate potential inconsistencies
    • Centralized catalog
      One instance of the global catalog at a central node
      Advantages:
      • No need to update copies
      • Little memory consumption
      Disadvantages:
      • Communication with the central node for each query
      • The central node potentially represents a bottleneck

    • Replicated catalog
      Full copy of the global catalog at each node
      Advantages:
      • Little communication overhead for queries
      • Good availability
      Disadvantages:
      • High update costs

    • Fragmented catalog
      Partitioning the global catalog and assigning partitions to nodes
      Advantages:
      • Sharing load among nodes
      • Reducing update overhead
      Disadvantages:
      • Localizing the necessary partitions of the global catalog

    • Caching catalog data
      Caching non-local catalog data
      Advantages:
      • Avoiding remote access to frequently needed catalog data
      • Reducing communication overhead
      Disadvantages:
      • Coherency control
      • Invalidating cached copies in the presence of updates
    • Caching catalog data (cont.)
      • Explicit invalidation: the owner of catalog data remembers the nodes holding local copies; in case of updates, it sends an invalidation message to those nodes
      • Implicit invalidation: identifying stale catalog data at runtime (adding version numbers and time stamps to query messages)

    • Data localization
      Objective: creating subqueries in consideration of the data distribution
      Assumptions:
      • Fragmentation is defined by fragmentation expressions
      • Each fragment is allocated at only one node (no replication)
      • Fragmentation expressions and the locations of the fragments are stored in the catalog
      Main tasks:
      • Replace access to global relations with accesses to the fragments
      • Insert the reconstruction expression into the algebra query
      • Basic algebraic simplifications of the query

    • Example – horizontal reduction
      Schema:
        Projects1 = σBudget≤150.000(Projects)
        Projects2 = σ150.000<Budget≤200.000(Projects)
        Projects3 = σBudget>200.000(Projects)
      Reconstruction expression (horizontal fragmentation):
        Projects = Projects1 ∪ Projects2 ∪ Projects3
      Example query:
        σLocation='Saarbr.' ∧ Budget≤100.000(Projects)
      After replacing references to global relations:
        σLocation='Saarbr.' ∧ Budget≤100.000(Projects1 ∪ Projects2 ∪ Projects3)
      Further optimization is possible!
    • Query simplification – horizontal reduction
      Objective: eliminate unnecessary subqueries
      Horizontal reduction rule (see the interval sketch below):
      • Given fragments of R as FR = {R1, ..., Rn} with Ri = σpi(R)
      • All fragments Ri for which σps(Ri) = ∅ can be removed, with ps denoting the query's selection predicate
      • Because σps(Ri) = ∅ ⇐ ∀x ∈ R : ¬(ps(x) ∧ pi(x))
      • The selection with the query predicate ps on fragment Ri is empty if ps contradicts the fragmentation predicate pi of Ri, i.e., ps and pi are never true at the same time for any tuple in Ri

    • Example – horizontal reduction
      Query with fragmentation expression:
        σLocation='Saarbr.' ∧ Budget≤100.000(Projects1 ∪ Projects2 ∪ Projects3)
      Fragment definitions:
        Projects1 = σBudget≤150.000(Projects)
        Projects2 = σ150.000<Budget≤200.000(Projects)
        Projects3 = σBudget>200.000(Projects)
      Since
        σBudget≤100.000(Projects2) = ∅ and σBudget≤100.000(Projects3) = ∅
      we obtain the reduced query
        σLocation='Saarbr.'(σBudget≤100.000(Projects1))

    • Query simplification – join reduction
      Join reductions:
      • Larger joins are replaced by multiple partial joins on fragments
      • Distributive law: (R1 ∪ R2) ⋈ S = (R1 ⋈ S) ∪ (R2 ⋈ S)
      • Eliminate all union fragments that will return an empty result
      Expectations:
      • Elimination of partial joins producing empty results – depends on fragmentation optimality
      • Many joins on small relations may have lower resource costs than one large join – depends on fragmentation and the applied join algorithms
      • Smaller joins can be executed in parallel – might decrease response time but might also increase communication costs

    • Example – join reduction
      Schema:
        Projects(PNo, PName, Budget, Location)
        Projects1 = σPNo='P1' ∨ PNo='P2'(Projects)
        Projects2 = σPNo='P3'(Projects)
        Projects3 = σPNo='P4'(Projects)
        Assignment(ENo, PNo, Duration)
        Assignment1 = σPNo='P1' ∨ PNo='P2'(Assignment)
        Assignment2 = σPNo='P3' ∨ PNo='P4'(Assignment)
      Example query:
        SELECT * FROM Projects p, Assignment a WHERE p.PNo = a.PNo
      In relational algebra: Projects ⋈ Assignment
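    Returning to the horizontal reduction rule above: for range predicates like the Budget fragments, the contradiction test σps(Ri) = ∅ reduces to an interval-intersection check; a minimal sketch (the interval encoding is an assumption of mine):

        # Fragments encoded as (lo, hi] Budget intervals; infinities mark open ends.
        fragments = {
            "Projects1": (float("-inf"), 150_000),    # Budget <= 150.000
            "Projects2": (150_000, 200_000),          # 150.000 < Budget <= 200.000
            "Projects3": (200_000, float("inf")),     # Budget > 200.000
        }

        query = (float("-inf"), 100_000)  # ps: Budget <= 100.000

        def overlaps(a, b):
            """Non-empty intersection of two (lo, hi] intervals."""
            return min(a[1], b[1]) > max(a[0], b[0])

        kept = [name for name, rng in fragments.items() if overlaps(rng, query)]
        print(kept)  # ['Projects1'] -- Projects2 and Projects3 contradict ps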
    • Example – join reduction (cont.)
      Query: Projects ⋈ Assignment
      After replacing global relations with reconstruction expressions:
        (Projects1 ∪ Projects2 ∪ Projects3) ⋈ (Assignment1 ∪ Assignment2)
      After applying the distributive law:
        (Projects1 ⋈ Assignment1) ∪ (Projects1 ⋈ Assignment2) ∪
        (Projects2 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2) ∪
        (Projects3 ⋈ Assignment1) ∪ (Projects3 ⋈ Assignment2)
      Further optimization is possible!

    • Query simplification – join reduction
      Join reduction rule:
      • Given fragments of R as FR = {R1, ..., Rn} and fragments of S as FS = {S1, ..., Sn}
      • Apply the distributive law, e.g.:
          (R1 ∪ R2) ⋈ (S1 ∪ S2) = (R1 ⋈ S1) ∪ (R1 ⋈ S2) ∪ (R2 ⋈ S1) ∪ (R2 ⋈ S2)
      • All partial joins between fragments Ri and Sj for which Ri ⋈ Sj = ∅ can be removed:
          Ri ⋈ Sj = ∅ ⇐ ∀x ∈ Ri, y ∈ Sj : ¬(pi(x) ∧ pj(y))
      • The join between fragments Ri and Sj is empty if their respective fragmentation predicates (on the join attribute) contradict, i.e., there is no tuple combination x and y such that both partitioning predicates are fulfilled at the same time

    • Example – join reduction (cont.)
      Some of these partial joins are empty, e.g.:
        Projects1 ⋈ Assignment2 = ∅
      because their fragmentation expressions contradict:
        Projects1 = σPNo='P1' ∨ PNo='P2'(Projects) and Assignment2 = σPNo='P3' ∨ PNo='P4'(Assignment)
      Reduced query (see the sketch below):
        (Projects1 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2) ∪ (Projects3 ⋈ Assignment2)

    • Query simplification – join reduction for horizontal fragmentation
      • The easiest join reduction case follows from derived horizontal fragmentation
      • For each fragment of the first relation, there is exactly one matching fragment of the second relation
      • Simply use the information contained in the reconstruction expression instead of comparing the fragmentation predicates to each other
      • Join reduction for arbitrary horizontal partitioning might not be beneficial
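    The same contradiction test drives join reduction when fragments are defined by lists of PNo values; a sketch using sets (the set encoding is my own):

        # Fragmentation predicates represented as sets of PNo values.
        projects = {"Projects1": {"P1", "P2"}, "Projects2": {"P3"}, "Projects3": {"P4"}}
        assignments = {"Assignment1": {"P1", "P2"}, "Assignment2": {"P3", "P4"}}

        # Keep only partial joins whose fragmentation predicates can both hold,
        # i.e., whose PNo value sets intersect.
        partial_joins = [
            (p, a)
            for p, pvals in projects.items()
            for a, avals in assignments.items()
            if pvals & avals
        ]
        print(partial_joins)
        # [('Projects1', 'Assignment1'), ('Projects2', 'Assignment2'),
        #  ('Projects3', 'Assignment2')]  -- the reduced query from the slides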
    • Join reduction for derived horizontal fragmentation – example
      Schema:
        Projects(PNo, PName, Budget, Location)
        Projects1 = σPNo='P1' ∨ PNo='P2'(Projects)
        Projects2 = σPNo='P3' ∨ PNo='P4'(Projects)
        Assignment(ENo, PNo, Duration)
        Assignment1 = Assignment ⋉ Projects1
        Assignment2 = Assignment ⋉ Projects2
      Query in relational algebra: Projects ⋈ Assignment
      After replacing global relations with reconstruction expressions:
        (Projects1 ∪ Projects2) ⋈ (Assignment1 ∪ Assignment2)
      After applying the distributive law:
        (Projects1 ⋈ Assignment1) ∪ (Projects1 ⋈ Assignment2) ∪
        (Projects2 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2)
      Reduced query (using the information about the fragmentation of relation Assignment directly):
        (Projects1 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2)

    • Query simplification – vertical reduction
      Vertical fragmentation rule:
      • Given fragments of R as FR = {R1, ..., Rn} with Ri = πβi(R), where βi enumerates a subset of R's attributes
      • Avoid joining "useless" fragments, i.e., fragments containing only attributes that are neither referenced in the query nor output in the result

    • Example – vertical reduction
      Schema:
        Projects(PNo, PName, Budget, Location)
        Projects1 = πPNo,PName,Location(Projects)
        Projects2 = πPNo,Budget(Projects)
      Reconstruction expression: Projects = Projects1 ⋈ Projects2
      Example query: πPName(Projects)
      After replacing references to global relations: πPName(Projects1 ⋈ Projects2)
      After removing unnecessary fragments: πPName(Projects1)
    • Query simplification – hybrid fragmentation
      The reconstruction expression introduces combinations of joins and unions.
      General guidelines:
      • Remove empty relations generated by contradicting selections on horizontal fragments
      • Remove useless relations generated by vertical fragments
      • Break and distribute joins, eliminate empty fragment joins

    • Qualified relations
      Supporting algebraic optimization of queries involving fragments:
      • Annotating fragments and intermediate relations with predicates
      • Estimating the size of a relation
      • Extension of relational algebra
      Definition: a qualified relation is a pair [R : qR] where R is a relation and qR is a predicate.
      Example: representing a horizontal fragment as a qualified relation whose qualification predicate corresponds to the fragmentation expression:
        [Projects1 : PNo='P1' ∨ PNo='P2']

    • Qualified relations – extended relational algebra
      (1) E := σF[R : qR] → [E : F ∧ qR]
      (2) E := πA[R : qR] → [E : qR]
      (3) E := [R : qR] × [S : qS] → [E : qR ∧ qS]
      (4) E := [R : qR] − [S : qS] → [E : qR]
      (5) E := [R : qR] ∪ [S : qS] → [E : qR ∨ qS]
      (6) E := [R : qR] ⋈F [S : qS] → [E : qR ∧ qS ∧ F]

    • Qualified relations – example
      Example query: σ100.000≤Budget≤200.000(Projects)
      Qualified relations:
        E1 = σ100.000≤Budget≤200.000[Projects1 : Budget ≤ 150.000]
          [E1 : (100.000 ≤ Budget ≤ 200.000) ∧ (Budget ≤ 150.000)]
          [E1 : 100.000 ≤ Budget ≤ 150.000]
        E2 = σ100.000≤Budget≤200.000[Projects2 : 150.000 < Budget ≤ 200.000]
          [E2 : (100.000 ≤ Budget ≤ 200.000) ∧ (150.000 < Budget ≤ 200.000)]
          [E2 : 150.000 < Budget ≤ 200.000]
        E3 = σ100.000≤Budget≤200.000[Projects3 : Budget > 200.000]
          [E3 : (100.000 ≤ Budget ≤ 200.000) ∧ (Budget > 200.000)]
          E3 = ∅
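    The qualified-relation rules can be exercised with sympy intervals: rule (1) conjoins the selection predicate with the qualification, which for range predicates is an interval intersection, and an empty intersection flags a removable fragment (exactly the E3 case above):

        from sympy import Interval, oo

        query = Interval(100_000, 200_000)  # selection predicate as a Budget interval
        fragments = {
            "Projects1": Interval(-oo, 150_000),            # Budget <= 150.000
            "Projects2": Interval.Lopen(150_000, 200_000),  # 150.000 < Budget <= 200.000
            "Projects3": Interval.open(200_000, oo),        # Budget > 200.000
        }

        # Rule (1): sigma_F[R : qR] -> [E : F ∧ qR]; for range predicates the
        # conjunction is an interval intersection, EmptySet marks a droppable E.
        for name, q_r in fragments.items():
            print(name, query.intersect(q_r))
        # Projects1 Interval(100000, 150000)
        # Projects2 Interval.Lopen(150000, 200000)
        # Projects3 EmptySet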
    • Workflow for distributed query processing
      [Figure: workflow for distributed query processing]

    • Introduction to global query optimization
      Main questions:
      • When to optimize?
      • What criteria to optimize?
      • Where to execute the query?
    • When to optimize? Full compile time optimization
      The full query execution plan is computed at compile time.
      Assumption: applications use canned queries (prepared and parameterized SQL statements)
      Pros:
      • Queries can be executed directly
      Cons:
      • Much information is unknown or too expensive to gather – collecting statistics on all nodes?
      • Statistics become outdated; especially machine load and network properties are very volatile

    • When to optimize? Fully dynamic optimization
      Each query is optimized individually at runtime.
      This technique heavily relies on heuristics, learning algorithms, and luck.
      Pros:
      • Might produce very good plans
      • Uses the current network state
      • Also usable for ad-hoc queries
      Cons:
      • Complex to model
      • Result quality might be very unpredictable
      • Complex algorithms and heuristics
      • Difficult to keep statistics up-to-date

    • When to optimize? Semi-dynamic optimization
      • Pre-optimize the query
      • During query execution, test whether execution runs as expected during optimization: are tuples/fragments delivered in time? does the network adhere to the predicted properties? are there any bad network latencies? etc.
      • If execution shows severe deviations, compute a new query plan for all parts that have not yet been executed
      Only makes sense for queries that run for a longer time.

    • When to optimize? Hierarchical optimization
      Plans are created in multiple stages.
      Global-Local-Plans:
      • The global query optimizer creates a global query plan; focus on data transfer: which intermediate results are to be computed by which node? how should intermediate results be shipped?
      • Local query optimizers create local query plans; they decide on query plan layout, algorithms, indexes, etc. to deliver the requested intermediate result
      Two-Step-Plans: (continued on the next slide)
    • When to optimize? Hierarchical optimization (cont.)
      Two-Step-Plans:
      • During compile time, only the stable parts of the plan are computed: join order, join methods, access paths, etc.
      • During query execution, all missing plan elements are added: node selection, transfer policies, etc.
      • Both steps can be performed using traditional query optimization techniques (plan enumeration with dynamic programming)
      • Complexity is manageable, as each optimization problem is much easier than a full optimization
      • During runtime optimization, fresh statistics are available
      Most distributed database management systems use semi-dynamic or hierarchical optimization techniques (or both).

    • What criteria to optimize?
      Important aspects for global optimization:
      • Communication operators
      • Fragment cardinalities
      • Order of operations
      • Join ordering – because permutations of the joins within the query may lead to improvements of orders of magnitude
      Most important alternative optimization criteria:
      • Query response time
      • Resource consumption
      • Total query execution costs
      • ...

    • Where to execute the query?
      • The query optimizer has to decide which parts of the query have to be shipped to which node (cost model)
      • In heavily replicated scenarios, clever hybrid shipping can effectively be used for load balancing: move expensive computations to lightly loaded nodes, avoid expensive communication

    • Global query optimization
      Global query optimization deals with finding the "best" ordering of operations in the query (extended by fragmentation expressions and including communication operations) that minimizes a cost function.
      Input: an algebraic query extended by fragmentation expressions
      Output: an algebraic query or query execution plan with communication operations
    • Basics of global query optimization
      Objective:
      • Choose a cost-efficient execution plan based on the algebraic query plan given as input
      • Decide which parts of the query have to be transferred to which node
      Prerequisites:
      • Knowledge about fragmentation
      • Knowledge about fragment/relation sizes
      • Knowledge about data distribution
      • Knowledge about costs of operations

    • Optimizer components
      The global optimizer has three main components:
      • The search space: the set of alternative, equivalent execution plans representing the input query
      • The cost model: predicts the costs of a given query execution plan
      • The search strategy: explores the search space and selects the best plan

    • Phases of optimization
      1 Spanning the search space using transformation rules → equivalent query plans
      2 Applying a search strategy and a cost model → choose an efficient plan
      Main focus: join trees and join ordering – O(N!) different join trees by applying commutativity and associativity rules for N relations (see the numbers below)

    • Search space – example
      Query:
        SELECT EName, Title
        FROM Employees e, Assignment a, Project p
        WHERE e.EID = ENo AND a.PNo = p.PNo
      [Figure: equivalent join trees for the example query]
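    To get a feel for the O(N!) size of the join-ordering search space, counting the join orderings alone (ignoring tree shapes, which multiply this further) is already telling:

        from math import factorial

        # Number of join orderings (permutations of N relations).
        for n in (3, 5, 8, 10):
            print(n, factorial(n))
        # 3 6
        # 5 120
        # 8 40320
        # 10 3628800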
    • Search space – tree variants for join order optimization
      Linear join trees:
      • All inner nodes have at least one leaf node (base relation) as child
      • Reduces the search space
      Bushy trees:
      • May have inner nodes with no base relation as child
      • High potential for parallelization
      Trade-off: reducing the size of the search space vs. exhibiting parallelism
      Example with four relations:
        bushy join tree:  (R1 ⋈ R2) ⋈ (R3 ⋈ R4)
        linear join tree: R1 ⋈ (R2 ⋈ (R3 ⋈ R4))

    • Search strategies
      A search strategy needs to reduce the search space:
      • Applying heuristics (similar to centralized algebraic optimization): perform projections and selections when accessing base relations; avoid Cartesian products – enforce joins
      • Applying further heuristics influencing the shape of the join tree: linear vs. bushy trees

    • Deterministic search strategies
      Systematic generation of query plans:
      • Starting with plans accessing the base relations
      • Constructing complex plans by combining simpler plans, e.g., joining one more relation at each step
      Example deterministic search strategies:
      • Dynamic programming: (almost) exhaustive search by building all possible plans (breadth-first); "very bad" partial plans are pruned at an early stage; guarantees to find the best plan; only feasible for a small number (5-6) of relations
      • Greedy algorithm: only one plan is built (depth-first)
      Exhaustive search guarantees finding the best plan.
    • Randomized search strategies
      • One or more start plans are built using a greedy strategy (depth-first search)
      • Start plans are improved by examining "neighbor plans"
      • Neighbor plan: obtained by applying transformation rules, e.g., exchanging two arbitrarily chosen operations
      • Better performance with a higher number of relations
      • No guarantee to find the best plan

    • Distributed cost model
      Components:
      • Cost functions: estimating the costs of executing operations
      • Statistics: data about relation sizes, attribute domains, value distributions, etc.
      • Formulas: determining cardinalities, sizes of intermediate results, etc.

    • Cost functions – total execution time
      Sum of all costs, i.e., the sum of all processing times at all nodes involved in answering the query:
        Ttotal = TCPU · #insts + TI/O · #opsI/O + TMSG · #msgs + TTR · #bytes
      Components of total execution time:
      • Local processing costs/time: Tlocal = TCPU · #insts + TI/O · #opsI/O
      • Communication costs/time: Tcomm = TMSG · #msgs + TTR · #bytes
      where
      • TCPU: time to process a CPU instruction
      • TI/O: time for a disk access
      • TMSG: time to send and receive a message
      • TTR: time to transmit a data unit from one node to another
      • #bytes: the sum of the sizes of all messages
      The coefficients (TCPU, TI/O, TMSG, TTR) characterize a specific distributed database system:
      • WAN (Wide Area Network): communication time is dominant
      • LAN (Local Area Network): local costs also play an important role
      Typical assumption: TTR is constant – although this might not be true for remote nodes.
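    The total-time formula translates directly into code; the coefficient defaults below are made-up illustrative values, not taken from the slides:

        def total_time(insts, ops_io, msgs, n_bytes,
                       t_cpu=1e-9, t_io=5e-3, t_msg=1e-3, t_tr=1e-7):
            """T_total = T_CPU*#insts + T_I/O*#opsI/O + T_MSG*#msgs + T_TR*#bytes."""
            return t_cpu * insts + t_io * ops_io + t_msg * msgs + t_tr * n_bytes

        # e.g. 1M instructions, 100 disk accesses, 4 messages, 50 KB shipped:
        print(f"{total_time(1_000_000, 100, 4, 50_000):.3f} s")  # 0.510 s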
    • Cost functions – total time vs. response time
      Response time: the time that elapses between query initiation and completion, considering parallel local processing and parallel communication:
        Tresponse = TCPU · seq#insts + TI/O · seq#opsI/O + TMSG · seq#msgs + TTR · seq#bytes
      where seq#x represents the maximum number of instructions (insts), I/O operations (opsI/O), messages (msgs), or bytes (bytes) that have to be processed sequentially.
      Example (two senders transfer x and y data units in parallel to a common receiver):
        Tcomm,total = 2 · TMSG + TTR · (x + y)
        Tcomm,response = max{TMSG + TTR · x, TMSG + TTR · y}
      Minimizing response time does not imply that the total time is also minimized!

    • Statistics
      Good statistics are crucial:
      • Most important cost factor: the size of the intermediate results produced during execution
      • Sizes are estimated using statistics and formulas
      • Tradeoff between precision and the costs of managing statistics

    • Typical statistics
      Typical statistics for relation R fragmented as R1, R2, ..., Rr with attributes A1, ..., An:
      • Length of each attribute Ai in bytes: length(Ai)
      • Number of distinct values for each attribute Ai in each fragment Rj: valuesAi,Rj := card(πAi(Rj))
      • Minimum and maximum attribute values: min(Ai) and max(Ai)
      • Number of distinct values (cardinality) of the attribute domains: card(dom[Ai])
      • Number of tuples in each fragment Rj: card(Rj)
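    For the two-sender communication example, the contrast between the two cost functions is easy to check numerically (coefficients and message sizes are illustrative):

        T_MSG, T_TR = 1e-3, 1e-7   # illustrative coefficient values
        x, y = 100_000, 40_000     # data units shipped in parallel by the two senders

        t_total = 2 * T_MSG + T_TR * (x + y)                  # both transfers counted
        t_response = max(T_MSG + T_TR * x, T_MSG + T_TR * y)  # the slower branch decides

        print(f"total={t_total:.3f}  response={t_response:.3f}")
        # total=0.016  response=0.011 -- response time < total time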
    • Additional statistics
      • A histogram for each attribute Ai to approximate the frequency distribution
      • A join selectivity factor for some pairs of relations:
          SFJ(R, S) = card(R ⋈ S) / (card(R) · card(S))
        good (high) selectivity: SFJ = 0.001; bad (low) selectivity: SFJ = 0.5

    • Cardinality estimation
      Assumptions:
      • Independence between attributes
      • Uniform distribution of attribute values
      Selectivity: the ratio between the expected number of result tuples and the number of tuples of the input relation:
        SF = expected result size / cardinality of the input relation
      Example: σF(R) returns 10% of R's tuples → SFS(F, R) = 0.1 (SF: selectivity factor)
      Estimated result size (cardinality of the output relation) of a selection:
        card(σF(R)) = SFS(F, R) · card(R)

    • Selection selectivity
      Selectivity depends on the selection predicate p(A) and the constants v (see the functions below):
        SFS(A = v, R) = 1 / valuesA,R = 1 / card(πA(R))
        SFS(A < v, R) = (v − min(A)) / (max(A) − min(A))
        SFS(A > v, R) = (max(A) − v) / (max(A) − min(A))
        SFS(v1 < A < v2, R) = (v2 − v1) / (max(A) − min(A))
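    The selection-selectivity formulas map one-to-one onto small functions; a sketch assuming the uniform-distribution statistics introduced above (the Budget numbers are illustrative):

        def sf_eq(values_a_r):               # SF_S(A = v) = 1 / values_{A,R}
            return 1 / values_a_r

        def sf_lt(v, a_min, a_max):          # SF_S(A < v)
            return (v - a_min) / (a_max - a_min)

        def sf_range(v1, v2, a_min, a_max):  # SF_S(v1 < A < v2)
            return (v2 - v1) / (a_max - a_min)

        # card(σ_F(R)) = SF_S(F, R) * card(R), e.g. Budget uniform in [0, 400.000]:
        card_r = 1_000
        print(sf_range(100_000, 200_000, 0, 400_000) * card_r)  # 250.0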
    • Selection selectivity (cont.)
      Cardinality: card(σF(R)) = SFS(F, R) · card(R)
      Selectivity of composite predicates:
        SFS(p(Ai) ∧ p(Aj), R) = SFS(p(Ai), R) · SFS(p(Aj), R)
        SFS(p(Ai) ∨ p(Aj), R) = SFS(p(Ai), R) + SFS(p(Aj), R) − (SFS(p(Ai), R) · SFS(p(Aj), R))

    • Projection
      Cardinality:
      • Without duplicate elimination: card(πA(R)) = card(R)
      • With duplicate elimination, if defined on an arbitrary attribute A: card(πA(R)) = valuesA,R
      • With duplicate elimination, if one of the attributes is the primary key: card(πAi,...(R)) = card(R)
      • Cardinalities for projections on arbitrary combinations of attributes are hard to predict because attribute correlations are unknown

    • Cartesian product
      Cardinality: card(R × S) = card(R) · card(S)

    • Joins
      Given: R ⋈ S with R(A, B) and S(B, C), natural join on attribute B
      • Upper bound: the size of the Cartesian product
      • No B values shared between R and S: card(R ⋈ S) = 0
      • Foreign key relationship R.B → S.B: card(R ⋈ S) = card(R)
      • All tuples in R.B and S.B have the same value: card(R ⋈ S) = card(R) · card(S)
    • Joins (cont.)
      Given: R ⋈ S with R(A, B) and S(B, C), natural join on attribute B; upper bound: the size of the Cartesian product
      Estimate:
        card(R ⋈ S) = (card(R) · card(S)) / max{valuesB,R, valuesB,S}
      Store statistics (join selectivity SFJ) for important joins:
        card(R ⋈ S) = SFJ · card(R) · card(S)

    • Union and difference
      Cardinality is difficult to estimate because duplicates are removed.
      Union:
      • Upper bound: card(R ∪ S) = card(R) + card(S)
      • Lower bound: card(R ∪ S) = max{card(R), card(S)}
      Difference:
      • Upper bound: card(R − S) = card(R)
      • Lower bound: card(R − S) = 0

    • Selectivity estimation using histograms
      • In reality, the distribution of attribute values in a relation is often not uniform
      • Histograms consist of a set of buckets bi
      Example: histogram on attribute A of relation R; each bucket bi is defined by
      • Range: rangei – a range of values in the attribute domain dom[A]
      • Frequency: fi – the number of tuples of R where R.A ∈ rangei
      • Distinct values: di – the number of distinct values of A where R.A ∈ rangei

    • Selectivity estimation using histograms – equality predicate
      Given predicate A = v:
      • Identify the bucket bi with v ∈ rangei
      • SFS(A = v, R) = 1 / di
      • card(σA=v(R)) = SFS(A = v, R) · fi = fi / di
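    The general join-cardinality estimate from the top of this passage, with the stored-SFJ shortcut, as a small function (the example numbers are illustrative):

        def est_join_card(card_r, card_s, values_b_r, values_b_s, sf_j=None):
            """card(R ⋈ S) on join attribute B.

            If a measured join selectivity factor SF_J is stored for this pair,
            use card = SF_J * card(R) * card(S); otherwise fall back to
            card(R) * card(S) / max(values_{B,R}, values_{B,S}).
            """
            if sf_j is not None:
                return sf_j * card_r * card_s
            return card_r * card_s / max(values_b_r, values_b_s)

        # 400 employees, 1000 assignments, 400 distinct join values on each side:
        print(est_join_card(400, 1_000, values_b_r=400, values_b_s=400))  # 1000.0
        # consistent with the foreign-key case: every assignment finds its employee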
    • Selectivity estimation using histograms – range predicates
      Given predicate A ≤ v:
      • Identify the buckets that overlap the queried range
      • Sum up the frequencies; bucket i, which only partially overlaps the queried range, contributes proportionally:
          card(σA≤v(R)) = Σj=1..i−1 fj + ((v − min(rangei)) / (max(rangei) − min(rangei))) · fi

    • Phases of optimization (recap)
      1 Spanning the search space using transformation rules → equivalent query plans
      2 Applying a search strategy and a cost model → choose an efficient plan
      Main focus: join trees and join ordering

    • Join order optimization
      Simplifying assumptions:
      • No distinction between fragments and relations
      • Ignoring local processing time
      • Ignoring other operations (selection, projection)
      • No pipelining
      • Ignoring data transfer to the result site

    • Join order optimization – two relations
      Determine the join order for two relations R ⋈ S:
      transfer the smaller relation to minimize the network load.
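    A direct implementation of the two histogram estimators, assuming uniformity inside each bucket; the bucket list is made up:

        def est_eq(buckets, v):
            """card(σ_{A=v}(R)) = f_i / d_i for the bucket containing v."""
            for lo, hi, freq, distinct in buckets:
                if lo <= v < hi:
                    return freq / distinct
            return 0.0

        def est_leq(buckets, v):
            """card(σ_{A<=v}(R)): full buckets below v plus a pro-rata share
            of the bucket that only partially overlaps the range."""
            total = 0.0
            for lo, hi, freq, _ in buckets:
                if hi <= v:                # bucket fully inside the range
                    total += freq
                elif lo <= v:              # partial overlap: scale the frequency
                    total += (v - lo) / (hi - lo) * freq
            return total

        hist = [(0, 100, 40, 10), (100, 200, 25, 5), (200, 300, 35, 7)]
        print(est_eq(hist, 150))   # 25 / 5 = 5.0
        print(est_leq(hist, 150))  # 40 + 0.5 * 25 = 52.5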
Join order optimization for three relations
- Determine the join order for three relations R ⋈_A S ⋈_B T, with R at node_R, S at node_S, and T at node_T
- Possible orders:
  1. node_R: send R to node_S; node_S: compute R' = R ⋈ S, send R' to node_T; node_T: compute R' ⋈ T
  2. node_S: send S to node_R; node_R: compute R' = R ⋈ S, send R' to node_T; node_T: compute R' ⋈ T
  3. node_S: send S to node_T; node_T: compute S' = S ⋈ T, send S' to node_R; node_R: compute S' ⋈ R
  4. node_T: send T to node_S; node_S: compute S' = S ⋈ T, send S' to node_R; node_R: compute S' ⋈ R
  5. node_T: send T to node_S, and node_R: send R to node_S (the two transfers can perhaps exploit parallelism); node_S: compute R ⋈ S ⋈ T

Join order optimization with semijoins
- Considering semijoins for joining two relations R (at node_R) and S (at node_S) results in three alternatives, assuming A is the join attribute:
  1. R ⋈_A S = (R ⋉_A S) ⋈_A S = (R ⋉_A π_A(S)) ⋈_A S
  2. R ⋈_A S = R ⋈_A (S ⋉_A R)
  3. R ⋈_A S = (R ⋉_A S) ⋈_A (S ⋉_A R)
- Workflow for alternative 1:
  - node_S: compute S' = π_A(S), send S' to node_R
  - node_R: compute R' = R ⋉_A S', send R' to node_S
  - node_S: compute R' ⋈_A S
- Transfer costs (neglecting T_MSG): T_TR · card(π_A(S)) + T_TR · card(R ⋉_A S)
- Considering full joins (R ⋈_A S) only and assuming that card(R) < card(S), the complete relation R would have been sent to node_S; costs: T_TR · card(R)

Semijoins vs. joins
- Decision based on the sizes of the base relations and the intermediate results
- Transfer costs of the semijoin: T_TR · card(π_A(S)) + T_TR · card(R ⋉_A S)
- Transfer costs of the standard join: T_TR · card(R)
- The semijoin is preferable if card(π_A(S)) + card(R ⋉_A S) < card(R) (see the sketch below)

Total time models
- Basic strategy: a coordinator (master) site performs an exhaustive search; optimization objective: total time
- Input: relational algebra tree, cost model, statistics, locations of the relations
- Output: optimized query execution plan
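The semijoin-vs-join decision rule can be written directly as a predicate over estimated cardinalities; T_TR cancels out, so only the cardinalities matter. A minimal sketch with invented example estimates:

```python
# Decision rule from the slide: ship pi_A(S) plus the reduced R, or ship R?

def semijoin_preferable(card_pi_a_s, card_r_semijoin_s, card_r):
    """True iff card(pi_A(S)) + card(R semijoin S) < card(R)."""
    return card_pi_a_s + card_r_semijoin_s < card_r

# Invented estimates: |pi_A(S)| = 100, |R semijoin S| = 300, |R| = 1000.
print(semijoin_preferable(100, 300, 1000))  # True: use the semijoin
print(semijoin_preferable(800, 900, 1000))  # False: ship R directly
```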
Total time models: aspects
- Cost model
- Site selection and data transfer
- Join order optimization
- Join implementation

Site selection and data transfer

Query shipping
- The query initiator (the node at which the query is issued and optimized) sends the query to other nodes
- The receiver nodes compute the query result and ship the result back to the initiator

Data shipping
- The query remains at the initiator
- The initiator sends data request messages to other nodes
- The receiver nodes ship all required data to the initiator
- The initiator computes the result

Hybrid shipping
- The initiator sends partial queries to other nodes
- The other nodes execute some parts of the query and send intermediate results to the initiator
- The initiator executes the remaining query operations (post-processing)
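A deliberately simplified sketch of the three policies, not from the slides; all function and table names are invented, and "the network" is whatever crosses the function boundary:

```python
def query_shipping(remote_tables, query):
    # The initiator ships the query; the remote node evaluates it and
    # ships only the final result back.
    return query(remote_tables)

def data_shipping(remote_tables, needed, query):
    # The initiator requests the raw data and evaluates the query locally.
    shipped = {name: remote_tables[name] for name in needed}
    return query(shipped)

def hybrid_shipping(remote_tables, partial_query, post_process):
    # The remote node evaluates a partial query; the initiator
    # post-processes the shipped intermediate result.
    intermediate = partial_query(remote_tables)
    return post_process(intermediate)

tables = {"R": [(1, "a"), (2, "b"), (3, "c")]}
q = lambda ts: [t for t in ts["R"] if t[0] > 1]
print(query_shipping(tables, q))                       # remote does all work
print(data_shipping(tables, ["R"], q))                 # initiator does all work
print(hybrid_shipping(tables, q, lambda ir: ir[:1]))   # work is split
```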
Site selection and data transfer for joins

Problem
- Queries make extensive use of joins, and computing joins is very expensive
- Joins need special attention in distributed systems because of fragments and replication

Basic strategies
- Ship whole: transfer the complete relation
- Fetch as needed: transfer the relation piecewise

Scenario
- 2 nodes: one (node_R) storing relation R, the other (node_S) storing relation S
- The query asks for R ⋈ S

  R: A B      S: B C D      R ⋈ S: A B C D
     3 7         9 8 8             1 1 5 1
     1 1         1 5 1             4 5 7 8
     4 6         9 4 2
     7 7         4 3 3
     4 5         4 2 6
     6 2         5 7 8
     5 7

Ship whole: execution at node_R
- node_R: send a data request message (relation S) to node_S
- node_S: send the requested data (relation S) to node_R
- Total costs: 2 messages, 18 attribute values

Ship whole: execution at node_S
- node_S: send a data request message (relation R) to node_R
- node_R: send the requested data (relation R) to node_S
- Total costs: 2 messages, 14 attribute values
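The ship-whole accounting can be checked mechanically. A sketch on the example tables above, under the slides' counting scheme (one request message plus one data message per shipped relation, cost measured in attribute values):

```python
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]             # R(A, B)
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]   # S(B, C, D)

def ship_whole_cost(relation):
    messages = 2                              # one request + one data message
    values = sum(len(t) for t in relation)    # attribute values shipped
    return messages, values

print(ship_whole_cost(S))  # (2, 18): execute the join at node_R
print(ship_whole_cost(R))  # (2, 14): execute the join at node_S
```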
Ship whole: execution at a third node node_X
- node_X: send a data request message (relation R) to node_R
- node_X: send a data request message (relation S) to node_S
- node_R: send the requested data (relation R) to node_X
- node_S: send the requested data (relation S) to node_X
- Total costs: 4 messages, 18 + 14 = 32 attribute values

Fetch as needed: execution at node_R
- node_R: send a data request message (tuples of relation S with B = 7) to node_S
- node_S: send the requested data (0 tuples of relation S with B = 7) to node_R
- node_R: send a data request message (tuples of relation S with B = 1) to node_S
- node_S: send the requested data (1 tuple of relation S with B = 1) to node_R
- ...
- Total costs: 7 · 2 = 14 messages, 7 + 2 · 3 = 13 attribute values

Fetch as needed: execution at node_S
- node_S: send a data request message (tuples of relation R with B = 9) to node_R
- node_R: send the requested data (0 tuples of relation R with B = 9) to node_S
- node_S: send a data request message (tuples of relation R with B = 1) to node_R
- node_R: send the requested data (1 tuple of relation R with B = 1) to node_S
- ...
- Total costs: 6 · 2 = 12 messages, 6 + 2 · 2 = 10 attribute values

Ship whole vs. fetch as needed: conclusion
- Fetch as needed results in a high number of messages
- Ship whole results in high amounts of transferred data
- More advanced strategies build on these two basic strategies: the semijoin and the bitvector join
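The fetch-as-needed accounting, sketched on the same example tables: one request/reply pair per tuple of the local relation, each request carrying one join value, each reply carrying the matching tuples.

```python
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]             # B at index 1
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]   # B at index 0

def fetch_as_needed_cost(local, remote, local_b, remote_b):
    messages, values = 0, 0
    for t in local:
        messages += 2                # request + reply for this tuple's B value
        values += 1                  # the requested join value itself
        matches = [u for u in remote if u[remote_b] == t[local_b]]
        values += sum(len(u) for u in matches)    # shipped matching tuples
    return messages, values

print(fetch_as_needed_cost(R, S, 1, 0))  # (14, 13): executed at node_R
print(fetch_as_needed_cost(S, R, 0, 1))  # (12, 10): executed at node_S
```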
Semijoin
- Requests all join partners in just one step
- Basic consideration: R ⋈ S = R ⋈ (S ⋉ R) = R ⋈ (S ⋉ π_B(R)), with B being the join attribute
- Algorithm:
  - node_R: determine π_B(R) and send the result to node_S
  - node_S: determine S' = S ⋉ π_B(R) = S ⋉ R and send the result to node_R
  - node_R: determine R ⋈ S' = R ⋈ S

Bitvector join
- Also known as the hash filter join
- Avoids transferring all join attribute values to the other node; a bitvector BV[1...n] is transferred instead
- Transformation:
  - Choose an appropriate hash function h
  - Apply h to transform the attribute values to the range [1...n]
  - Set the corresponding bits in the bitvector BV[1...n] to 1
- Algorithm:
  - node_R: determine π_B(R), apply the hash function h to the result, set the corresponding bits in BV to 1, and send BV to node_S
  - node_S: apply h to the join attribute of relation S, determine S' = {t ∈ S | BV[h(t.B)] = 1}, and send S' to node_R
  - node_R: determine R ⋈ S' = R ⋈ S
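Both reducers, sketched on the running example. The hash function h(v) = v mod n and the bitvector size n = 8 are invented; everything else follows the two algorithms above. Note how the B = 9 tuples of S survive the bitvector filter as false positives:

```python
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]             # R(A, B)
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]   # S(B, C, D)

def semijoin_strategy():
    pi_b_r = {t[1] for t in R}                    # node_R: pi_B(R), shipped
    s_red = [u for u in S if u[0] in pi_b_r]      # node_S: S semijoin R, shipped
    return [t + u[1:] for t in R for u in s_red if t[1] == u[0]]   # node_R

def bitvector_strategy(n=8):
    h = lambda v: v % n                           # assumed hash function
    bv = [False] * n
    for t in R:                                   # node_R: build BV, ship it
        bv[h(t[1])] = True
    s_cand = [u for u in S if bv[h(u[0])]]        # node_S: filter S by BV, ship
    return [t + u[1:] for t in R for u in s_cand if t[1] == u[0]]  # node_R

print(semijoin_strategy())    # [(1, 1, 5, 1), (4, 5, 7, 8)]
print(bitvector_strategy())   # same result; B = 9 hashes to the set bit 1,
                              # so two useless S tuples travel as false positives
```

The semijoin ships exact join values; the bitvector ships only n bits but pays for collisions with unnecessary tuples, which is precisely the trade-off discussed next.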
Bitvector join: conclusions
- Transferring the bitvector reduces the network load
- The bitvector only indicates potential join partners because multiple attribute values might map to the same hash value
- This might result in transferring unnecessary tuples
- Requirements: an appropriate hash function h, and n needs to be large enough to avoid a high number of collisions

Response time models
- Two different response times: When does the first result tuple arrive? When have all result tuples arrived?
- "Classic" cost models consider the total resource consumption of a query
  - Good results for heavy computational load and slow network connections
  - By saving resources, many queries can be executed in parallel (minimum load, maximum throughput)
- Optimization for short response times
  - "Waste" some resources to get query results earlier
  - Take advantage of lightly loaded machines and fast connections
  - Utilize intraquery parallelism

Example situation
- Given relations/fragments A, B, C, and D
- Full replication, i.e., all relations/fragments are available on all nodes
- Compute (A ⋈ B) ⋈ (C ⋈ D)
- Assumptions:
  - Each join costs 20 time units (T_CPU + T_I/O)
  - Transferring an intermediate result costs 10 time units (T_MSG + T_TR)
  - Accessing a relation is free
  - Each node has one computational thread
Example: two plans
- Plan 1: execute all operations on one node; total costs: 60
- Plan 2: compute the joins on different nodes and ship the results; total costs: 80
- Plan 1 is obviously better with respect to total costs
- Response time: 60 for plan 1, 50 for plan 2 (see the sketch below)
- Plan 2 is better with respect to response time because operations can be executed in parallel (exploiting intra-query parallelism)
- Response time can be improved even more by applying pipelining

Pipelining
- Goal: good first-tuple response times by executing queries in a pipelined fashion
- Not pipelined:
  - Each operation is fully completed, and an intermediate result is created
  - The next operation reads the intermediate result and is then fully completed
  - Reading and writing intermediate results costs resources
- Pipelined:
  - Operations do not create intermediate results
  - Each processed tuple is fed directly into the next operation
  - Tuples "flow" through the operations

Pipelining: problems
- Operations have different execution times
- If the execution speeds of the operations in a pipeline differ, tuples are either cached or the pipeline blocks
- Some operations are more suitable for pipelining than others:
  - Good: scan, select, project, union, ...
  - Tricky: join, intersection, ...
  - Very hard: sort
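A sketch of the two plans under the stated cost assumptions (join = 20, transfer = 10, base-table access free, one thread per node); the scheduling model, with both partial joins and both transfers overlapping in plan 2, is our reading of the slides:

```python
JOIN, SHIP = 20, 10

# Plan 1: all three joins on one node, strictly sequential.
plan1_total = 3 * JOIN                           # 60
plan1_response = 3 * JOIN                        # 60: no parallelism

# Plan 2: A JOIN B on node 1, C JOIN D on node 2, final join on node 3.
plan2_total = 2 * JOIN + 2 * SHIP + JOIN         # 80: every unit of work counted
plan2_response = max(JOIN, JOIN) + SHIP + JOIN   # 50: the two joins overlap,
                                                 # and so do the two transfers

print(plan1_total, plan1_response)  # 60 60
print(plan2_total, plan2_response)  # 80 50
```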
Pipelining example
- Simple query: tablescan, selection, projection
- 1000 tuples are scanned, the selectivity is 0.1
- Costs (see the sanity check below):
  - Accessing one tuple during the tablescan: 2 time units
  - Selecting (testing) one tuple: 1 time unit
  - Projecting one tuple: 1 time unit

Non-pipelined execution:

  time   event
  2      first tuple in IR1
  2000   all tuples in IR1
  2001   first tuple in IR2
  3000   all tuples in IR2
  3001   first tuple in the result
  3100   all tuples in the result

Pipelined execution:

  time   event
  2      first tuple finished the tablescan
  3      first tuple finished the selection (if selected ...)
  4      first tuple in the result
  3098   last tuple finished the tablescan
  3099   last tuple finished the selection
  3100   all tuples in the result

Pipelining example: join query
- Two table subsets are joined using a non-pipelined BNL (block-nested-loop) join; both input pipelines run in parallel
- Costs:
  - 1000 tuples are scanned in each pipeline, selectivity 0.1
  - Joining 100 × 100 tuples: 10,000 time units (one time unit per combination)
- Response time:
  - The first tuple arrives at the end of either pipeline after 4 time units
  - All tuples have arrived at the ends of the pipelines after 3,100 time units
  - The final result is available after 13,100 time units
  - No benefit from pipelining with respect to response time: the first result tuple arrives long after step 3,100
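A sanity-check sketch of the two timelines under the stated per-tuple costs; it assumes a single processing thread and that the first scanned tuple passes the selection, as the slide's timeline does:

```python
N = 1000
SCAN, SELECT, PROJECT = 2, 1, 1
selected = int(N * 0.1)                         # 100 tuples survive the selection

# Non-pipelined: every operator runs to completion, writing IR1 and IR2.
ir1_done = N * SCAN                             # 2000: all tuples in IR1
ir2_done = ir1_done + N * SELECT                # 3000: all tuples in IR2
first_np = ir2_done + PROJECT                   # 3001: first tuple in the result
all_np = ir2_done + selected * PROJECT          # 3100: all tuples in the result

# Pipelined: each tuple flows through all operators before the next is scanned.
first_p = SCAN + SELECT + PROJECT               # 4: first tuple in the result
all_p = N * SCAN + N * SELECT + selected * PROJECT   # 3100: same total work

print(first_np, all_np)   # 3001 3100
print(first_p, all_p)     # 4 3100
```

The total work (3100) is identical; pipelining only moves the first result tuple forward, from 3001 to 4.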
Joins and pipelining
- The suboptimal result is caused by the unpipelined join A ⋈ B
- Most traditional join algorithms are unsuitable for pipelining
- Single/semi-pipelined: only one input is a pipeline; the other intermediate result has to be fully available
- Fully pipelined: both inputs are processed in a pipelined fashion

Single-pipelined hash join
- The "classic" join algorithm
- Basic idea: one input relation is read from an intermediate result (A), the other is pipelined through the join operation (B)
- All tuples of A are stored in a hash table
  - A hash function is applied to the join attribute
  - All tuples with the same hash value for the join attribute are in the same bucket
- Every incoming tuple of B (via the pipeline) is hashed by its join attributes
  - Compare the tuple to each tuple in the respective bucket of A
  - Return the tuples with matching join attributes

Double-pipelined hash join
- Dynamically build hash tables for both A and B tuples (memory intensive!)
- Process tuples upon arrival; cache tuples if necessary
- Balance between A and B tuples for better performance; rely on statistics for a good A:B ratio
- If a new tuple of relation A arrives:
  - Insert it into the A hash table
  - Check in the B hash table whether there are join partners
  - If yes, return all combined AB tuples
- If a new B tuple arrives, process it analogously

Double-pipelined hash join: example
- B(31, B2) arrives: insert it into the B hash table
- Find the matching A tuples: A3 is found
- Assuming that A3 matches B2, add AB(A3, B2) to the result
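A compact sketch of the double-pipelined (symmetric) hash join described above, assuming an equi-join on a single attribute; the tuple shapes mirror the slide's example:

```python
from collections import defaultdict

class DoublePipelinedHashJoin:
    def __init__(self):
        self.a_table = defaultdict(list)   # hash table over A's join values
        self.b_table = defaultdict(list)   # hash table over B's join values

    def on_a(self, a_tuple, key):
        """A new A tuple arrives: insert it, then probe B for partners."""
        self.a_table[key].append(a_tuple)
        return [(a_tuple, b) for b in self.b_table[key]]

    def on_b(self, b_tuple, key):
        """A new B tuple arrives: processed analogously."""
        self.b_table[key].append(b_tuple)
        return [(a, b_tuple) for a in self.a_table[key]]

# Tuples may arrive in any interleaving; each match is emitted exactly once,
# by whichever side arrives second.
join = DoublePipelinedHashJoin()
print(join.on_a(("A3", 31), key=31))   # []: no B partner has arrived yet
print(join.on_b((31, "B2"), key=31))   # [(('A3', 31), (31, 'B2'))]
```

Because every arriving tuple is inserted before probing, no match is lost regardless of arrival order, which is what makes the operator fully pipelined.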
Pipelining in distributed setups
- In pipelines, tuples "flow" through the operations; this works well with one processing unit (one node)
- Problem: sending each tuple separately from one node to another might be inefficient
- Communication costs:
  - Setting up the transfer and opening a communication channel
  - Composing a message
  - Transmitting the message: header information and payload (the minimum packet size is bigger than a tuple)
  - Receiving and decoding the message
  - Closing the channel

Tuple blocking
- Minimize the communication overhead by tuple blocking: do not send single tuples, but blocks containing multiple tuples (burst transmission; see the sketch below)
- Packets have to be cached
- The block size should be at least the packet size of the underlying network protocol
- This results in even more cost factors for the cost model

Summary on global query optimization
- Global query optimization has to deal with additional constraints and cost factors compared to "classic" query optimization
- Many steps can be reused from centralized query processing
- Optimization in distributed systems is much more complex (network latency, selectivities, communication costs, response time, etc.)

Summary I
- Detour on centralized query processing: query parsing, query transformation, query optimization
- Basics of distributed query optimization:
  - Network costs, network model, shipping policies
  - Fragmentation and allocation schemes
  - Different optimization goals (response time vs. total time)
  - Meta data management: where to store the global catalog?
  - Data localization: consider the fragmentation
- Distributed query optimization:
  - Very important question: where to execute which parts of the query?
  - When to optimize: compile time vs. dynamic optimization; most common: semi-dynamic and hierarchical optimization
  - Cost model (cost functions, statistics, cardinality estimation, etc.)
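Returning to tuple blocking above: a minimal sketch of batching a tuple stream into blocks before transmission; the block size and the stream are invented for illustration:

```python
def blocks(tuples, block_size):
    """Group a tuple stream into blocks of at most block_size tuples."""
    block = []
    for t in tuples:
        block.append(t)        # cache tuples until the block is full
        if len(block) == block_size:
            yield block        # burst transmission: one message per block
            block = []
    if block:
        yield block            # flush the final, possibly smaller block

# 1000 tuples at 50 tuples per block: 20 messages instead of 1000.
stream = ((i, i % 7) for i in range(1000))
print(sum(1 for _ in blocks(stream, 50)))   # 20
```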
Summary II
- Join order optimization
- Join implementations (ship whole, fetch as needed, semijoin, bitvector join, pipelined hash join, etc.)
- Total time and response time