Database Tuning – Self-Tuning Databases
Presented by: Ankur A Kath (kath@cs.umn.edu), Afshan Jabeen (jabee002@cs.umn.edu)
Graduate Students – Department of Computer Science (University of Minnesota – Twin Cities)
What are Self-Tuning Databases?
Databases that are capable of:
  • Managing and maintaining themselves
  • Adjusting to various circumstances
  • Preparing their resources for efficient handling of heterogeneous workloads
Also called “Autonomic DBMS”.
Motivation:
  • Increased emphasis on QoS
  • Advances in DB functionality, connectivity, availability and heterogeneity
  • Ongoing maintenance
  • Burgeoning database size
  • E-Service era
Characteristics of Self-Tuning Databases:
  • Self-optimizing: the DBMS delivers the best achievable performance for the given workload parameters, available resources and environment settings.
  • Self-configuring: the DBMS should recognize changes in its environment that warrant reconfiguration, and reconfigure itself without severely disrupting operations.
  • Self-healing: the DBMS remains in, or can be restored to, a consistent state at all times.
  • Self-protecting: the DBMS includes features that shield it from errant requests that may degrade its performance.
  • Self-organizing: the DBMS should be capable of dynamic re-organization and re-structuring.
  • Self-inspecting: the DBMS should “know itself” in order to make intelligent decisions about all of the above.
AutoAdmin: Self-Tuning Database
Initial thought:
  • Focus on automating the physical design of the relational database
The outcomes:
  • Tuning technology in Microsoft SQL Server
  • Self-tuning histograms
  • Monitoring infrastructure
The physical database design “problem”:
  • Given the storage bound, the optimizer-estimated costs and the workload, which is the best physical design?
  • Choice of physical design = f(usage profile of DB server, optimization overhead, storage)
Challenges:
  • Multiple physical design features
  • Interaction due to updates and storage
  • Scaling challenges
AutoAdmin’s ideas:
  • Recognize the frequent item-sets
  • Workload compression
  • Merge and Reduce
  • Enumeration: top-down, bottom-up
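The design problem stated on this slide can be written as a constrained optimization. This is only one possible formalization; the symbols are notation introduced here, not taken from the slides: W is the workload, \mathcal{I} the set of candidate physical design structures, Cost(q, C) the optimizer-estimated cost of query q under configuration C, size(i) the storage needed by structure i, and S the storage bound.

```latex
C^{*} \;=\; \arg\min_{C \subseteq \mathcal{I}} \; \sum_{q \in W} \mathrm{Cost}(q, C)
\qquad \text{subject to} \qquad \sum_{i \in C} \mathrm{size}(i) \;\le\; S
```

Because the number of subsets C grows exponentially with the number of candidates, exhaustive search does not scale, which is consistent with the slide listing workload compression and top-down or bottom-up enumeration among AutoAdmin's ideas.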
Customizing Physical Design Structure – AutoAdmin “What-if” Analysis
To reduce the overhead of DB administration:
  • Ability to select the right indexes
  • Ability to perform quantitative analysis of existing indexes
  • Propose hypothetical (“what-if”) indexes and quantitatively analyze their impact on performance
Architecture: Hypothetical Configuration Analysis (HCA) supports:
  • Simulation of a hypothetical configuration
  • Summary analysis of the results of the simulation
Steps in “what-if” analysis:
  1. Define the workload
  2. Define the hypothetical configuration
  3. Evaluate (summary analysis): estimate the cost of the queries in the workload as if the configuration were “real”
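The three steps of the “what-if” analysis can be sketched in a few lines of Python. This is a toy illustration, not AutoAdmin's interface: the `Query` class, the `estimated_cost` cost model (full scan unless a hypothetical index covers the predicate column) and all numbers are assumptions made up for the example. A real system would ask the query optimizer for the estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical workload entry: a single-column selection on one table."""
    table: str
    column: str          # column used in the selection predicate
    table_rows: int      # rows in the table
    matching_rows: int   # rows selected by the predicate
    frequency: int = 1   # how often the query occurs in the workload


def estimated_cost(query, configuration):
    """Toy cost model (an assumption): rows touched by the cheapest access path."""
    if (query.table, query.column) in configuration:
        return query.matching_rows   # index lookup
    return query.table_rows          # full table scan


def what_if(workload, configuration):
    """Step 3: estimate the workload cost as if `configuration` were real."""
    return sum(q.frequency * estimated_cost(q, configuration) for q in workload)


# Step 1: define the workload.
workload = [
    Query("orders", "customer_id", 100_000, 50, frequency=20),
    Query("orders", "status", 100_000, 30_000, frequency=5),
]

# Step 2: define the hypothetical configuration (no index is actually built).
current = frozenset()
hypothetical = frozenset({("orders", "customer_id")})

baseline_total = what_if(workload, current)
hypothetical_total = what_if(workload, hypothetical)
improvement = baseline_total - hypothetical_total
```

The point of the design is that only estimates are compared: no index is built until the summary analysis shows that the hypothetical configuration is worth it.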
Online AutoAdmin (Physical Design Tuning)
Problems with existing solutions for automated physical design:
  • Explicit invocation of the tuning tools
  • Manual workload gathering
Online AutoAdmin:
  • “Always-on” component
  • Dynamically adapts the physical design to varying workload and data characteristics
Architecture:
  • Extended metadata manager: supports “candidate hypothetical indexes”
  • Internal representation of indexes
  • Leverage the pre-processing in the query optimization stage
  • Analyze cost/benefit ratio: determine the design change (dropping/creating indexes)
  • Send the design change request to the Task Manager
“On-the-fly” approach: generate dynamic access-path requests
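A minimal sketch of the cost/benefit step, under stated assumptions: the class `OnlineTuner`, its method names, and the rule "create an index once its accumulated benefit exceeds its creation cost" are hypothetical simplifications for illustration, not the published algorithm. Dropping indexes is left out for brevity.

```python
from collections import defaultdict


class OnlineTuner:
    """Toy always-on tuner: accumulates the benefit of candidate indexes."""

    def __init__(self, creation_cost):
        # Hypothetical input: estimated cost of building each candidate index.
        self.creation_cost = creation_cost
        self.benefit = defaultdict(float)
        self.materialized = set()

    def observe(self, index, cost_without, cost_with):
        """Record one access-path request for `index`.

        Returns a design-change request for the task manager, or None.
        """
        if index in self.materialized:
            return None
        # Benefit of this request: estimated saving if the index existed.
        self.benefit[index] += max(0.0, cost_without - cost_with)
        if self.benefit[index] > self.creation_cost[index]:
            self.materialized.add(index)
            return ("CREATE", index)
        return None


tuner = OnlineTuner({"idx_orders_customer": 100.0})
requests = [tuner.observe("idx_orders_customer", 60.0, 10.0) for _ in range(3)]
```

In this run each request saves 50 cost units, so the accumulated benefit passes the creation cost of 100 only on the third request, and that is when the tuner emits the change request.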
Design Schemes – Indexing Schemes
Static index management:
  • Uses index wizards
  • Requires DBA effort for index creation and maintenance
Dynamic index management and autonomous tuning:
  • Create indexes automatically using index-building queries
Processing index-building queries:
  1. For a given query Q, potentially useful indexes are determined
  2. For the query Q, a cost-optimized plan is derived
  3. The query Q is re-optimized using virtual indexes
  4. Profits are computed as the difference between the costs of the plans from steps 2 and 3
Dynamic self-tuning index structures – motivation:
  • Data-balanced instead of access-balanced structures
  • Coarse granularity of tuning
  • Unawareness of system resource usage
Example: Adaptable Binary Tree
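The four processing steps for an index-building query can be sketched as follows. The optimizer is passed in as a plain function, and `candidate_indexes`, `profit`, `toy_optimize` and the cost figures are hypothetical stand-ins chosen for this example; the slides do not specify how candidates are chosen or how costs are estimated.

```python
from typing import Callable, FrozenSet, Iterable

# Assumed optimizer interface: maps a set of available indexes to a plan cost.
Optimizer = Callable[[FrozenSet[str]], float]


def candidate_indexes(predicate_columns: Iterable[str]) -> FrozenSet[str]:
    """Step 1 (toy rule): one single-column index per predicate column."""
    return frozenset(f"idx_{column}" for column in predicate_columns)


def profit(optimize: Optimizer, candidates: FrozenSet[str]) -> float:
    """Steps 2 to 4: compare the plan cost with and without virtual indexes."""
    cost_real = optimize(frozenset())      # step 2: plan with existing indexes only
    cost_virtual = optimize(candidates)    # step 3: re-optimize with virtual indexes
    return cost_real - cost_virtual        # step 4: profit


def toy_optimize(indexes: FrozenSet[str]) -> float:
    """Made-up costs: 1000 for a full scan, 40 with an index on customer_id."""
    return 40.0 if "idx_customer_id" in indexes else 1000.0


candidates = candidate_indexes(["customer_id"])
gain = profit(toy_optimize, candidates)
```

A positive `gain` indicates that the virtual indexes would have made Q cheaper, which is the signal used to decide whether building them is worthwhile.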
Design Schemes – Self-tuning Histograms
Traditional histograms:
  • Cost of building and maintaining is high
  • Have to be rebuilt when data changes
  • Multi-dimensional histograms are complicated to build
Approach for self-tuning (ST) histograms:
  • Build histograms not by examining data but by using ‘free’ feedback information
Initial histogram:
  • For an ST histogram ‘h’ on an attribute ‘a’, we need to know the number of histogram buckets B, the number of tuples in the relation T, and the max and min values of ‘a’
  • Uniformity assumption: each bucket starts with a frequency of T/B tuples
Refining bucket frequencies:
  • act: actual size of the selection (observed when the query executes)
  • est: estimated size of the selection using the histogram
  • esterr = act - est (absolute estimation error)
  • Change in frequency of buckets: assign the blame for the error to the buckets used for estimation, in proportion to their current frequencies
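The initialization and refinement rules above can be sketched in Python. This is a simplified one-dimensional illustration under stated assumptions: equal-width buckets, range selections of the form [low, high), and a damping factor `alpha` that scales how much of the error is applied per feedback step. The class and method names, and the default value of `alpha`, are choices made for this example and do not come from the slides.

```python
class STHistogram:
    """Toy self-tuning histogram with equal-width buckets on one attribute."""

    def __init__(self, num_buckets, num_tuples, lo, hi):
        self.lo = lo
        self.width = (hi - lo) / num_buckets
        # Uniformity assumption: every bucket starts with T/B tuples.
        self.freq = [num_tuples / num_buckets] * num_buckets

    def _overlaps(self, low, high):
        """Return (bucket index, covered fraction) for buckets hit by [low, high)."""
        hits = []
        for i in range(len(self.freq)):
            b_lo = self.lo + i * self.width
            b_hi = b_lo + self.width
            overlap = min(high, b_hi) - max(low, b_lo)
            if overlap > 0:
                hits.append((i, overlap / self.width))
        return hits

    def estimate(self, low, high):
        """est: estimated size of the selection low <= a < high."""
        return sum(self.freq[i] * frac for i, frac in self._overlaps(low, high))

    def refine(self, low, high, act, alpha=0.5):
        """Spread the error over the buckets used, proportional to frequency."""
        est = self.estimate(low, high)
        if est == 0:
            return  # no frequency to apportion the blame by
        esterr = act - est
        for i, frac in self._overlaps(low, high):
            share = frac * self.freq[i] / est   # this bucket's part of est
            self.freq[i] = max(self.freq[i] + alpha * esterr * share, 0.0)


h = STHistogram(num_buckets=4, num_tuples=1000, lo=0, hi=100)
before = h.estimate(0, 50)       # 500.0 under the uniformity assumption
h.refine(0, 50, act=700)         # feedback: the query actually returned 700 rows
after = h.estimate(0, 50)
```

With `alpha=0.5` the estimate for the same range moves from 500 halfway toward the observed 700; the damping keeps a single noisy observation from overcorrecting the histogram.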
Design Schemes – Using Reflection
Automated diagnosis of the possible sources of performance problems.
[Figure labels: Resource model, Diagnosis rules, Workload model, Diagnosis tree, Diagnosis, Adaptation, Execution, Inspection, DBMS expertise and documentation]
Approach:
  • Model-based approach: users define models of their system’s resources and workload and provide a set of diagnosis rules; a diagnosis tree is generated from this information
Implementation:
  • Reflective DBMS: maintains a self-representation model, and changes to that model are automatically reflected in the underlying system
  • Reflection enables inspection and adaptation of systems at run time
[Figure labels: DTW, Data, Self representation, Performance data warehouse, DBMS, Software, Application, Diagnosis function, Monitor]
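To make the idea of a diagnosis tree concrete, here is a small sketch in Python. Everything in it is hypothetical: the `Node` class, the metric names (`buffer_hit_ratio`, `lock_wait_ms`), the thresholds and the suggested actions are invented for illustration, and the tree is written by hand rather than generated from resource and workload models as the slide describes.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Node:
    """Inner node of a diagnosis tree: a rule evaluated on monitored metrics."""
    rule: Callable[[dict], bool]
    if_true: Union["Node", str]    # subtree, or a diagnosis at a leaf
    if_false: Union["Node", str]


def diagnose(node, metrics):
    """Walk the tree from the root until a leaf (a diagnosis string) is reached."""
    while isinstance(node, Node):
        node = node.if_true if node.rule(metrics) else node.if_false
    return node


# Hand-written example tree with made-up rules and thresholds.
tree = Node(
    rule=lambda m: m["buffer_hit_ratio"] < 0.90,
    if_true="increase buffer pool size",
    if_false=Node(
        rule=lambda m: m["lock_wait_ms"] > 100,
        if_true="investigate lock contention",
        if_false="no problem diagnosed",
    ),
)

result = diagnose(tree, {"buffer_hit_ratio": 0.97, "lock_wait_ms": 250})
```

In a reflective DBMS, the metrics dictionary would come from inspecting the system's self-representation, and the resulting diagnosis would drive the adaptation step.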
The Taxonomy
Tuning to self-tuning (manual to automatic):
  • Major focus on physical database design: the AutoAdmin project
Why physical design?
  • Physical database design determines how efficiently a query is executed in a DBMS
The physical database design “problem”:
  • For a given workload, find a configuration, i.e. a set of indexes, that minimizes the cost
  • All papers recognized the infeasibility of performing the analysis on “real” structures
  • Feedback control loop: “on-the-fly” physical design decisions
  • Hypothetical scenarios: AutoAdmin’s “what-if” analysis, self-tuning histograms
Design schemes:
  • Moving from static indexes to dynamic indexes to dynamic index structures
  • Traditional histograms to self-tuning histograms
  • Bridging the gap between monitoring and parameter adjustment
The results:
  • Self-managing DBMS
  • Low maintenance cost
  • Removal of manual error
  • Performance guarantee (criticality)

AutoAdmin Survey
