Autonomics and Data Management

  • Autonomics and Data Management Norman Paton University of Manchester
  • Hypothesis
      • If database management systems are to be effective in an increasing range of challenging environments, such as grids, then automation will have to follow them into these new settings.
  • Outline
    • Existing examples of automation.
    • Limitations in current practice.
    • Opportunities presented by ubiquitous automation.
  • Outline
    • Existing examples of automation:
      • Database administration.
      • Query processing.
      • Data integration.
    • Limitations in current practice.
    • Opportunities presented by ubiquitous automation.
  • Example: Database Administration
    • Database administration involves setting values for a lot of controls:
      • Where to put indexes.
      • What views to materialise.
      • How to allocate memory.
      • Maximum number of concurrent transactions.
      • Which disks to place data on.
      • Which statistics to maintain.
      • How often to refresh statistics.
      • Which transaction isolation level to use.
    • Autonomic database administration may set any of these automatically.
  • Multiprogramming Level
    • The multiprogramming level (MPL) indicates the maximum number of concurrent transactions that may be run.
    • Problem: excessive lock conflicts may lead to thrashing, either through deadlocks or significant amounts of blocking.
    • Setting the MPL:
      • If too high, then risk of thrashing.
      • If too low, then too many jobs waiting in queue.
    • The risk of thrashing at a given MPL depends on the update intensity of the transactions.
    • G. Weikum, A. Mönkeberg, C. Hasse, P. Zabback: Self-tuning Database Technology and Information Services: from Wishful Thinking to Viable Engineering. VLDB 2002: 20-3.
  • Automating the Setting of MPL – 1
    • Observation:
      • Want to set the MPL as high as possible, but not too high!
      • Identify a property that indicates that there is a high risk of conflicts.
      • Conflict ratio:
        • (# locks held by all transactions / # locks held by non-blocked transactions)
        • Experimental and analytical studies indicated that a level of 1.3 or more means there is a high risk of thrashing.
  • Automating the Setting of MPL – 2
    • Monitoring:
      • Number of active transactions.
      • Number of blocked transactions.
    • Assessment:
      • Conflict ratio exceeds 1.3.
    • Response:
      • Transaction admission policy:
        • Block admission of new transactions from queue.
      • Transaction cancellation policy:
        • Cancel one or more blocking transactions.
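The monitor/assess/respond loop above can be sketched in a few lines. This is a minimal illustration, not the mechanism from the cited paper: the `Transaction` record and the function names are hypothetical, and only the conflict-ratio formula and the 1.3 threshold come from the slides.

```python
from dataclasses import dataclass

CRITICAL_CONFLICT_RATIO = 1.3  # threshold quoted on the slide


@dataclass
class Transaction:
    """Hypothetical monitoring record for one active transaction."""
    txn_id: int
    locks_held: int
    blocked: bool


def conflict_ratio(active):
    """Locks held by all transactions / locks held by non-blocked transactions."""
    total = sum(t.locks_held for t in active)
    running = sum(t.locks_held for t in active if not t.blocked)
    if total == 0:
        return 1.0  # no locks are held, so there is nothing to conflict on
    if running == 0:
        return float("inf")  # every lock holder is blocked
    return total / running


def admit_new_transaction(active):
    """Admission policy: keep new work queued while the ratio is critical."""
    return conflict_ratio(active) < CRITICAL_CONFLICT_RATIO
```

A cancellation policy would use the same assessment, but would pick a blocking transaction to abort instead of holding back the queue.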
  • Example: Query Evaluation
    • Query optimization involves making lots of decisions:
      • Which operators to use.
      • What order to evaluate the operators in.
      • What parallelism level to use.
      • How to allocate work to parallel nodes.
    • Adaptive query processing may revise any of the decisions made by a query optimizer during query evaluation.
  • Adaptation for Load Balancing
    • In partitioned parallelism, a task is divided into subtasks that are run in parallel on different nodes.
    • For a join, A ⋈ B is represented as the union of the results of plan fragments F_i = A_i ⋈ B_i, for i = 1..P, where P is the level of parallelism.
    • The time taken to evaluate the join is max(evaluation_time(F_i)), for i = 1..P.
    • As a result, any delay in completing a fragment F_i delays the completion of the operator, so it is crucial to match fragment size to node capabilities.
    • Many join algorithms have state; as such, changing the size of a fragment allocated to a machine involves replicating or relocating operator state.
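To make the max() argument concrete, the sketch below compares an even split with a split proportional to node speed. It assumes a deliberately simple cost model (evaluation time is linear in fragment size, and node speed is a fixed tuples-per-second figure); both function names are hypothetical.

```python
def join_time(fragment_sizes, node_speeds):
    """Completion time of a partitioned join: the slowest fragment decides."""
    return max(size / speed for size, speed in zip(fragment_sizes, node_speeds))


def proportional_allocation(total_tuples, node_speeds):
    """Split the work in proportion to node speed (tuples per second)."""
    total_speed = sum(node_speeds)
    return [total_tuples * speed / total_speed for speed in node_speeds]


speeds = [100, 100, 50]             # the third node is half as fast
even = [300_000, 300_000, 300_000]  # equal-sized fragments
matched = proportional_allocation(900_000, speeds)

print(join_time(even, speeds))      # limited by the slow node
print(join_time(matched, speeds))   # all fragments finish together
```

Under this model the even split is held back by the slow node, while the proportional split lets every fragment finish at the same time.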
  • Load Balancing: Flux
    • When load imbalance is detected:
      • Halt query execution.
      • Compute new distribution policy (dp).
      • Update hash tables by transferring data between nodes.
      • Update dp in parent exchange nodes.
      • Resume query execution.
    • M. Shah, J.M. Hellerstein, S. Chandrasekaran, M.J. Franklin, Flux: An Adaptive Partitioning Operator for Continuous Query Systems. 25-36, ICDE 2003.
    [Figure: plan with Scan(A), distribution policy dp, Join(A_1, B_1) with hash table A_1, and Join(A_2, B_2) with hash table A_2.]
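The state-transfer step of the sequence above can be sketched as follows. This is an illustration only, not the Flux implementation: it assumes the distribution policy is a plain mapping from hash bucket to owning node, that each node's hash table is a dictionary keyed by bucket, and that query execution is already halted when the function runs.

```python
def apply_distribution_policy(tables, old_dp, new_dp):
    """Move each reassigned bucket's hash-table state to its new owner.

    tables: node -> {bucket: [tuples]}   (hypothetical layout)
    old_dp, new_dp: bucket -> node
    Returns the policy that parent exchange operators should now use.
    """
    for bucket, new_owner in new_dp.items():
        old_owner = old_dp[bucket]
        if new_owner != old_owner and bucket in tables[old_owner]:
            # Relocate operator state before any tuple is routed with new_dp.
            tables[new_owner][bucket] = tables[old_owner].pop(bucket)
    return dict(new_dp)
```

Only buckets whose owner changes are touched, so the cost of an adaptation grows with the amount of state that has to move.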
  • Load Balance: Dynamic Hashing
    • Hash table build time:
      • Partition every hash table bucket over three randomly chosen nodes.
      • Store every tuple in the 2 most lightly loaded of the 3 nodes.
    • Hash table probe time:
      • Probe the 2 hash tables on the 2 most lightly loaded nodes storing the bucket, the primary being that with the lightest load, and the secondary that with the second lightest load.
      • Match tuples on the primary node unless a matching tuple is available only on the secondary node.
    • N.W. Paton, V. Raman, G. Swart, I. Narang, Autonomic Query Parallelization using Non-dedicated Computers: An Evaluation of Adaptivity Options, Proc. ICAC, 221-230, 2006.
    [Figure: plan with Scan(A), Scan(B), and Join(A_1, B_1), Join(A_2, B_2), Join(A_3, B_3) with hash tables A_1, A_2 and A_3.]
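The build and probe rules above can be sketched with in-memory dictionaries. This is a minimal illustration under stated assumptions, not the evaluated system: node load is a number supplied by the caller, each node's hash table is a dictionary, and all function names are hypothetical.

```python
import random

REPLICAS_PER_BUCKET = 3  # nodes chosen per bucket (from the slide)
COPIES_PER_TUPLE = 2     # copies stored per tuple (from the slide)


def assign_buckets(num_buckets, nodes, rng):
    """Build time: map every bucket to three randomly chosen nodes."""
    return {b: rng.sample(nodes, REPLICAS_PER_BUCKET) for b in range(num_buckets)}


def lightest(candidates, load, count):
    """Return the `count` candidate nodes with the lowest current load."""
    return sorted(candidates, key=lambda node: load[node])[:count]


def insert_tuple(key, value, bucket_nodes, tables, load):
    """Build time: store the tuple on the 2 lightest of the bucket's 3 nodes."""
    bucket = hash(key) % len(bucket_nodes)
    for node in lightest(bucket_nodes[bucket], load, COPIES_PER_TUPLE):
        tables[node].setdefault(key, []).append(value)


def probe(key, bucket_nodes, tables, load):
    """Probe time: match on the primary node, fall back to the secondary."""
    bucket = hash(key) % len(bucket_nodes)
    primary, secondary = lightest(bucket_nodes[bucket], load, COPIES_PER_TUPLE)
    return tables[primary].get(key) or tables[secondary].get(key, [])
```

Because every tuple is stored on two of its bucket's three nodes, any two of those nodes include at least one copy, so the probe finds a match even if the load ranking has changed since build time.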
  • Load Balance: Redundant Work
    • Adapt retrospectively – when a plan fragment A_i ⋈ B_i is late completing, start evaluating a redundant version of the fragment.
    • Typically, for parallelism level P, assuming a perfect hash function and no skew, each node will join |A|/P to |B|/P tuples.
    • However, this leads to tight dependencies between successive joins.
    [Figure: plan with Scan(A), Scan(B), Join(A_1, B_1), Join(A_2, B_2), Join(C_1, Join(A, B)_1) and Join(C_2, Join(A, B)_2).]
  • Load Balance: Redundant Work
    • An alternative distribution strategy allocates vertical slices of the plan to a single node.
    • For this to work, the workload allocated to each node must be larger than that allocated in the case of exchange.
    • For example, if there are 2 nodes, then for C ⋈ (A ⋈ B), one option is to join |A| to |B|/2 tuples on each node, with the result joined with all of C.
    • Whenever a fragment is late completing, a redundant version is started.
    • V. Raman, W. Han, I. Narang: Parallel Querying with Non-Dedicated Computers. 61-72, VLDB 2005.
    [Figure: plan with Scan(A), Scan(B), Join(A, B_1), Join(A, B_2), Join(C, Join(A, B_1)) and Join(C, Join(A, B_2)).]
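The effect of starting a redundant fragment can be illustrated with a simple timing model. Everything here is an assumption made for the sketch: the `lateness_factor` threshold, the idea that lateness is judged against an expected time, and the function name are all hypothetical, and the cited paper may decide lateness differently.

```python
def completion_with_backup(primary_time, expected_time, backup_time,
                           lateness_factor=1.5):
    """Finish time of one fragment when a late primary triggers a redundant copy.

    The fragment is treated as late once it has run for
    expected_time * lateness_factor; at that point a redundant version is
    started elsewhere, and whichever copy finishes first is used.
    """
    deadline = expected_time * lateness_factor
    if primary_time <= deadline:
        return primary_time  # on time: no redundant work is started
    return min(primary_time, deadline + backup_time)
```

With an expected time of 10 and a backup that also takes 10, a primary that would take 40 is cut to 25, while one that takes 18 gains nothing from the redundant copy.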
  • Example: Data Integration
    • Data integration involves assembling information about the relationships between sources:
      • What sources there are.
      • The services provided by the source.
      • The concepts represented in each source.
      • How the data is represented.
      • What relationships there are between extents.
      • What mappings exist between source data types.
    • Autonomic data integration involves inferring some of the above data.
  • Inferring Web Service Annotations
    • Web service annotations are useful for:
      • discovering services.
      • composing workflows.
      • characterising and identifying mismatches.
    • However, service annotation is expensive, as it requires:
      • knowledge of the ontology used for annotation.
      • knowledge of the web services to be annotated.
    • (Semi)automatic annotation can be carried out using:
      • schema matching and text classification techniques.
      • workflow specifications.
      • K. Belhajjame, S.M. Embury, N.W. Paton, R. Stevens and C.A. Goble, Automatic Annotation of Web Services Based on Workflow Definitions, Proc. 5th Intl. Semantic Web Conference, Springer, 116-129, 2006.
  • Inferring Web Service Annotations
    • Use workflows to infer information about the semantics of linked parameters:
  • Summary on Examples of Automation
    • Data management and integration are complex, with many possibilities to benefit from automation.
    • Automation has been applied in many different settings, with many worthwhile results.
    • The diversity in approaches to and technologies associated with automation is great.
  • Outline
    • Existing examples of automation.
    • Limitations in current practice.
    • Opportunities presented by ubiquitous automation.
  • Outline
    • Existing examples of automation.
    • Limitations in current practice:
      • Predictability.
      • Methodology.
      • Composability.
      • Semantics.
    • Opportunities presented by ubiquitous automation.
  • Limitations: Predictability
    • Adaptive systems change system behaviour in response to runtime feedback. Risks include:
      • Reacting too quickly in response to temporary effects.
      • Reacting too slowly to be effective.
      • Reacting in a way that makes things worse.
    • It can be difficult for developers of adaptive systems to predict how effective their proposals might be.
    • It sometimes takes several attempts to refine an adaptive strategy.
  • Adaptive Load Balancing: Comparison
    • Several existing strategies were compared, across a range of environmental conditions.
    • Conditions could be identified in which all of the proposals were worse than not adapting.
    • Published evaluations of the existing proposals gave no indication of problematic cases.
    • Several of the developers did not know under which circumstances their approaches performed poorly.
    • N.W. Paton, V. Raman, G. Swart, I. Narang, Autonomic Query Parallelization using Non-dedicated Computers: An Evaluation of Adaptivity Options, Proc. ICAC, 221-230, 2006.
  • Adaptive Load Balancing: Experiment
    • Query:
      • P ⋈ PS (P has 200,000 tuples, PS has 800,000 tuples).
      • Simulation of parallel run on three nodes.
    • Types of imbalance:
      • Constant: A consistent external load exists on one of the nodes throughout the experiment. The level of the external load represents the number of external tasks that are seeking to make full-time use of the machine.
      • Periodic: The load on one of the machines comes and goes during the experiment. The duration of the load indicates how long each load spike lasts, and the repeat duration represents the gap between load spikes.
  • Results: Constant Imbalance
  • Periodic Imbalance (1s)
  • Designing Adaptive Strategies
    • Overheads: pessimistic strategies carry out additional work on the assumption that things will go wrong (e.g. replicating data).
    • Adaptation costs: optimistic strategies evaluate queries as normal, but may pay a high price to carry out specific adaptations when required.
    [Figure: strategies Adapt-1 to Adapt-5 positioned by Overheads and Adaptation Cost.]
  • Limitations: Methodology
    • Adaptive data management proposals are generally described as specific algorithms or techniques:
      • It is often not clear what methodology has been followed in their development.
      • It is not necessarily clear if there are well established techniques that could have been used to direct their design.
    • Approaches that have been applied in the design of adaptive systems include:
      • Systematic functional decomposition.
      • Control theory.
  • Autonomic Computing Architecture
    • Autonomic systems typically involve a control loop, with monitoring information driving planning and decision making.
    • IBM’s Autonomic Computing Toolkit provides components that implement a functional decomposition known as MAPE (Monitor, Analyze, Plan and Execute).
    • The toolkit provides implementations for several of the components (in particular Monitor and Analyze).
    J.O. Kephart, D.M. Chess, The Vision of Autonomic Computing, IEEE Computer, 36(1), 41-50, 2003.
  • Data Management and MAPE
    • Sensors: what monitoring information should a database platform expose to enable effective decision making?
    • Effectors: what hooks should a database platform expose to enable effective runtime modification?
    • It is not straightforward:
      • to retrofit sensing and effecting functionality.
      • to predict what may be required.
    • The Monitor, Analyze, Plan and Execute components may also be implemented in different ways.
    • Generic monitoring components have been proposed for tracking query progress and for adaptation:
      • A. Gounaris, N. Paton, A. Fernandes, R. Sakellariou, Self-Monitoring Query Execution for Adaptive Query Processing, Data and Knowledge Eng., 51(3), 325-348, 2004.
      • L. Luo, J. Naughton, C. Ellmann, M. Watzke, Towards a progress indicator for database queries, SIGMOD, 791-802, 2004.
  • Monitoring Query Progress
    • Progress monitoring predicts properties of an operator incrementally from monitored data.
    • Raw monitoring data may count the number of tuples returned by an operator, the average tuple size, etc.
    • From such information, operator selectivity, result size and runtime can be estimated.
    • Unnest:
      • σ = (n_out / n_in)
      • cardinality = cardinality_operand * σ
      • size = cardinality_operand * σ * avg(size_result_tuple)
      • time = cardinality_operand * σ * tuple_build_cost
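The Unnest estimates above translate directly into code. In this sketch the function and parameter names are hypothetical; the four formulas are the ones listed on the slide, with the selectivity computed from the tuples observed so far.

```python
def unnest_estimates(n_in, n_out, operand_cardinality,
                     avg_result_tuple_size, tuple_build_cost):
    """Estimate final properties of an unnest operator from monitored counts.

    n_in / n_out: tuples consumed and produced so far.
    operand_cardinality: estimated total number of input tuples.
    """
    selectivity = n_out / n_in
    cardinality = operand_cardinality * selectivity
    return {
        "selectivity": selectivity,
        "cardinality": cardinality,
        "size": cardinality * avg_result_tuple_size,
        "time": cardinality * tuple_build_cost,
    }


# After 1,000 of an estimated 10,000 input tuples, 2,500 results were produced.
print(unnest_estimates(1_000, 2_500, 10_000,
                       avg_result_tuple_size=64, tuple_build_cost=0.002))
```

Re-running the estimate as more tuples are observed refines the selectivity, which is what makes the prediction incremental.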
  • Building Adaptive Databases
    • Most adaptive database extensions involve hard coding changes to the existing code base.
      • Complex core infrastructure subject to intrusive changes.
      • Steep learning curve for developers of adaptive extensions.
      • Incremental changes result in reduced reuse.
    • With respect to MAPE:
      • Growing experience with generic monitoring.
      • Considerable diversity in Analyze, Plan and Execute.
      • Control theory provides some insights into decision making.
  • Control Theory
    • Provides a systematic framework for computing a change to an input given a measured output.
    • Designs seek to exhibit SASO properties:
      • Stable: bounded input gives bounded output.
      • Accurate: measured output converges on desired value.
      • Short Settling: converges to stable value quickly.
      • No Overshoot: achieves objectives in a steady manner.
    • Either find a control engineer, study the textbook, or apply a well established model.
      • J.L. Hellerstein, Y. Diao, S. Parakh, D.M. Tilbury, Feedback Control of Computing Systems, Wiley, 2004.
  • Control Theory: PID Controllers
    • Source: http://en.wikipedia.org/wiki/PID_control
  • PID Controllers Example
    • Task: evaluating queries from a queue over a server.
    • Objective: keep all query evaluation in memory to avoid use of multi-pass algorithms.
    • Goal for controller: keep the amount of free memory at 512 MB to ensure that this condition is met.
    • Control parameter: multiprogramming level.
  • Proportional Controller
    • Terminology:
      • m: output signal.
      • K_p: proportional gain.
      • e: error.
    • Definition: m = K_p e.
    • Query processing example:
      • m: multiprogramming level.
      • e: (amount of free memory – 512 MB).
      • K_p: 1/(job size in MB), assumed to be 0.01 for 100 MB jobs.
  • Proportional Controller: Example
    • e: Error → m: Multiprogramming Level Change
      • 1024 → 10.24
      • 512 → 5.12
      • 256 → 2.56
      • 0 → 0
      • -256 → -2.56
      • -512 → -5.12
      • -1024 → -10.24
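The values in this example follow from m = K_p e with K_p = 0.01. The sketch below recomputes them; the function name and the choice to pass free memory in MB are assumptions made for illustration.

```python
TARGET_FREE_MB = 512  # desired amount of free memory
K_P = 0.01            # 1 / job size in MB, assuming 100 MB jobs


def mpl_change(free_memory_mb):
    """Proportional control: change in multiprogramming level for the current error."""
    error = free_memory_mb - TARGET_FREE_MB
    return K_P * error


for free in (1536, 1024, 768, 512, 256, 0):
    print(free, round(mpl_change(free), 2))
```

A positive result means more queries can be admitted; a negative result means the multiprogramming level should be reduced until free memory recovers.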
  • Integrative and Derivative Controllers
    • Integrative Controller:
      • Controller output depends on level and duration of error.
      • K_i: integral gain.
      • T_i: integral time.
      • Definition:
    • Differential Controller:
      • Controller output depends on rate of reduction in error.
      • K_d: differential gain.
      • T_d: derivative time.
      • Definition:
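The equations on this slide did not survive in the transcript. For reference, the standard textbook forms of the integral and derivative terms are given below, using the symbols defined above; this is the usual formulation and may differ in detail from what the slide showed.

```latex
% Integral term: output grows with the accumulated error.
m(t) = K_i \int_0^t e(\tau)\, d\tau, \qquad K_i = \frac{K_p}{T_i}

% Derivative term: output follows the rate of change of the error.
m(t) = K_d \, \frac{d e(t)}{d t}, \qquad K_d = K_p \, T_d
```

A full PID controller sums the proportional term m = K_p e with these two terms.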
  • Control Theory for Data Management
    • There are currently rather few examples of control theory being used in data management. A recent example in grid query processing:
      • Anastasios Gounaris, Christos Yfoulis, Rizos Sakellariou and Marios Dikaiakos, Self-optimizing Block Transfer in Web Service Grids, WIDM, 2007.
    • Modelling the relationship between measured values and controlled inputs can be challenging.
    • Many adaptive data management techniques change more than an input parameter. For example:
      • A query may be reoptimized by an adaptive query processor.
  • Limitations: Composability
    • Many proposals for autonomic data management focus on specific adaptations:
      • Selecting views for materialization.
      • Selecting data for replication.
      • Selecting fields for indexing.
      • Allocation of memory to functions.
    • … however, such decisions are often inter-related, and modelling the inter-relationships between such strategies is challenging.
  • Query Processing Inter-Dependency
    • Load imbalance results from inappropriate allocation of work to resources in partitioned parallelism.
    • Bottlenecks result from inappropriate allocation of work to resources in pipelined parallelism.
    • There is no benefit from resolving load imbalance if the bottleneck is elsewhere in the plan.
    • Resolving load imbalance may change the location of the bottleneck.
    [Figure: plan joining A, B and C through several join operators and a coordinator, annotated with the adaptations "Change Allocation" and "Remove Bottleneck".]
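The interaction between the two kinds of problem can be shown with a small throughput model. It is a sketch under simplifying assumptions: a partitioned stage is limited by its slowest partition, a pipeline is limited by its slowest stage, and the rates, shares and function names are all hypothetical.

```python
def stage_rate(node_rates, shares):
    """Throughput of a partitioned stage (tuples/s).

    Node i receives fraction shares[i] of the stream and can process
    node_rates[i] tuples/s, so the stage runs at the pace of whichever
    node saturates first.
    """
    return min(rate / share for rate, share in zip(node_rates, shares))


def pipeline_rate(stage_rates):
    """Throughput of a pipeline: the slowest stage is the bottleneck."""
    return min(stage_rates)


node_rates = [90, 30]
unbalanced = stage_rate(node_rates, [0.5, 0.5])   # slow node limits the join
balanced = stage_rate(node_rates, [0.75, 0.25])   # shares match node speed

# If a downstream stage only sustains 50 tuples/s, rebalancing gains nothing.
print(pipeline_rate([unbalanced, 50]), pipeline_rate([balanced, 50]))
# If downstream sustains 100 tuples/s, rebalancing moves the bottleneck there.
print(pipeline_rate([unbalanced, 100]), pipeline_rate([balanced, 100]))
```

In the first case the adaptation is wasted effort; in the second it helps, but the next adaptation has to target a different operator.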
  • Limitations: Semantics
    • Property guarantees:
      • Autonomic systems change behaviour mid-task.
      • Non-trivial adaptations may leave uncertainty as to whether an adaptation is meaning-preserving.
      • Few adaptations have had their meaning-preserving properties proved:
        • K. Eurviriyanukul, A. Fernandes, N. Paton, A Foundation for the Replacement of Pipelined Physical Join Operators in Adaptive Query Processing, EDBT Workshops, 589-600, 2006.
  • Limitations: Semantics
    • Performance guarantees:
      • Autonomic behaviour may take certain risks with performance.
      • Some proposals may redo work, leading to the need for thresholds to remove the risk of continuous reoptimization:
        • V. Markl, V. Raman, D. Simmen, G. Lohman, H. Pirahesh: Robust Query Processing through Progressive Optimization. SIGMOD Conference 2004: 659-67.
      • Some algorithms provide bounded worst case performance:
        • Daniel M. Yellin: Competitive algorithms for the dynamic selection of component implementations. IBM Systems Journal 42(1): 85-97 (2003).
  • Summary on Limitations of Automation
    • Automation is currently partial in scope and often ad hoc in development.
    • Automation is a second class citizen in data management; there is interest in the benefits it can bring but not so much in automation per se.
    • As a result, automation in data management can be seen as immature, with considerable scope for improving the predictability, composability and clarity of proposals through enhanced methodologies.
  • Outline
    • Existing examples of automation.
    • Limitations in current practice.
    • Opportunities presented by ubiquitous automation.
  • Outline
    • Existing examples of automation.
    • Limitations in current practice.
    • Opportunities presented by ubiquitous automation:
      • Increasing manageability of database technologies.
      • Extending the reach of database technologies.
  • Increasing Manageability - 1
    • Database products:
      • Commercial database systems are typically associated with a high total cost of ownership, resulting in significant measure from high administrative costs.
      • Vendors are seeking to improve competitiveness by automating or supporting management of their intrinsically complex products.
    • Data management components:
      • It has been suggested that current database products are too complex, and that more data should be managed by lighter weight components.
      • As yet, there is little evidence that light-weight data management components are being designed with automation in mind, but this is perhaps a practical proposition.
  • Increasing Manageability - 2
    • There are increasing needs to manage personal data, and data management within workgroups or laboratories is often hindered by the complexity of current data management platforms.
    • Personal and workgroup data management often has evolving requirements, but rarely needs the full range of capabilities of current database products.
    • Proposals in this space:
      • Data services: I. Subasu, P. Ziegler, K. Dittrich: Towards Service-Based Database Management Systems. BTW Workshops 2007: 296-30.
      • Data components: S. Chaudhuri, G. Weikum: Rethinking Database System Architecture: Towards a Self-Tuning RISC-Style Database System. VLDB 2000: 1-1.
  • Increasing Reach - 1
    • Most automation in data management has sought to ask the question:
      • Which current requirements can be met better by increasing the ranges of tasks that are carried out automatically?
    • An alternative view gives rise to a different question:
      • If we assume that there is to be no manual administration, what sorts of data management system can be developed?
  • Increasing Reach - 2
    • The vision of dataspaces is to support database style access over diverse sources with minimal manual integration.
      • A. Halevy, M. Franklin, D. Maier: Principles of dataspace systems. PODS 2006: 1-9.
    • Preliminary proposals match schemas automatically but partially, thus giving approximate answers that can be ranked.
      • J-P. Dittrich, M. Salles: iDM: A Unified and Versatile Data Model for Personal Dataspace Management. VLDB 2006: 367-378.
      • S. Abiteboul, N. Polyzotis: The Data Ring: Community Content Sharing. CIDR 2007: 154-16.
    • The challenge is to enable querying over structured data in a personal file store, within an organisation or at internet scale, with no manual integration.
  • Conclusions
    • Automation is already in lots of places:
      • Database administration.
      • Query evaluation.
      • Data integration.
    • Automation in data management is not mature:
      • Predictability.
      • Methodology.
      • Composability.
      • Semantics.
    • If automation becomes a more central focus:
      • Understanding of automation per se should improve.
      • The nature of data management systems will change.