Tractor Pulling on Data Warehouse

This topic was presented by Martin Kersten (CWI) at the 4th International Workshop on Testing Database Systems (DBTest 2011) on June 13th, 2011 in Athens, Greece.

Publication: http://bit.ly/yK5JZk

Abstract:
Robustness of database systems under stress is hard to quantify, because there are many factors involved, most notably the user expectation to perform a job within certain bounds of the user requirements. Nevertheless, the robustness of database systems is very important to end users. In this paper we develop a database benchmark suite, inspired by tractor pulling, where robustness is measured as a system's ability to process data despite a continuous increase in system load, as defined in terms of data volume, query volume and complexity. A functional evaluation is performed against several systems to highlight the benchmark capabilities.

Transcript

  • 1. Tractor Pulling on Data Warehouses
       Martin Kersten, Volker Markl, Meikel Poess, Kai-Uwe Sattler, Alfons Kemper, Ani Nica
       DBTest 2011
  • 2. The good old days
       • The early eighties, when
         – Oracle appeared on the scene
         – Ingres was a respected innovator in RDBMS
         – System R fought the CODASYL battle
         – IMS was still dominating the market
       • There was a need for a metric to evaluate the solutions
  • 3. The good old days
       • Turned into an organised battle
         – TPC-C, TPC-H, TPC-D, TPC-W, …
         – hundreds of benchmarks to flex one's muscles
  • 4. • We need tools to assess a solution space
       • We don't need weapons to win a 'war'
  • 5. Dagstuhl 2010 Robust Query Processing
  • 6. • With each step in the pull, the tension on the tractor increases (exponentially)
       • The tractor driver throttles and changes gears to keep it going
  • 7. Ingredients of the DBMS Tractor Pull
       • A tractor pull is a series of workload steps for which we measure the performance
       • Each step is defined by
         – Catalog changes
         – Database load: delete + load + create index
         – Query processing: BI grouped statistics
         – Concurrency
         – Act of God operations
  • 8. A database soil
       Generate a small database (< RAM)
       Use a single data type
  • 9. A database soil
       COPY the smaller relation into the larger one Cop
  • 10. A database soil
  • 11. Query template
       SELECT R0.B0, ..., Ri.Bi, count(*),
              avg(R0.B0), avg(R1.B0), avg(R1.B1), ..., avg(Ri.B0), ...
       FROM R0, ..., Ri
       WHERE selectpattern(R0, ..., Ri) AND joinpattern(R0, ..., Ri)
       GROUP BY R0.B0, ..., Ri.Bi
       ORDER BY R0.B0, ..., Ri.Bi
       Linear, cyclic, star-based, and clique query patterns
       The n-th query load includes the (n-1)-th query load
  • 12. Scenarios
       • Tractor pull workload
       • W(N) = <S, L, Pre, Qry, Post, qry, db>
         – S: schema adjustments
         – L: loading the database
         – Pre: pre-optimization
         – Qry: query execution
         – Post: post-optimization
         – qry: query characteristics
         – db: database growth function
  • 13. Hill scenario
       • The Hill scenario models a data warehouse that grows with a modest growth rate of g ∈ (0, 1) (e.g., g = 0.2).
       • It starts out from a main-memory focus until it overflows into a few disks.
       • It will highlight a system's robustness to deal with the memory-disk
  • 14. Hill scenario
       A modestly growing warehouse with a single user.
       The database fits in memory at first, then spills over to disk.
       d ∈ (0%, 100%), g ∈ (0, 1)
       Number of connections at track i: 1
       db(0) = (d × RAM) × (1 / (2 × dom))
       db(i) = g × i × db(0)
       qry(0) = 1, qry(i) = 4
       |Q(i)| = 1 + 4 × i
  • 15. Meadow scenario
       A stable warehouse with multiple users.
       Query templates stress complexity.
       d ∈ (0%, 100%), g = 0, C > 1
       Number of connections at track i: C
       db(0) = (d × RAM) × (1 / (2 × dom))
       db(i) = 0 (no growth)
       qry(0) = 0, qry(i) = C
       |Q(i)| = 1 + C × i
  • 16. Rockies scenario
       A growing warehouse with multiple users.
       Query templates stress complexity.
       d ∈ (0%, 100%), g ∈ (0, 10)
       Number of connections at track i: i
       db(0) = (d × RAM) × (1 / (2 × dom))
       db(i) = g × i × db(0)
       qry(0) = 0, qry(i) = i × 4
       |Q(i)| = 1 + 4 × i × (i + 1) / 2
  • 17. Robustness metrics
       • It is a multi-dimensional metric aimed at measuring the deviation from the expected norm
       • Robust(N) = <L, S, QO, QOk, QE, QEk, H>
         – Standard deviation of the loading time L
         – Standard deviation of the storage requirements
         – Standard deviation of query optimization (per track)
         – Standard deviation of query execution (per track)
         – Standard deviation, holistic
  • 18. A hill scenario
  • 19. A meadow scenario
  • 20. A Rockies scenario
  • 21. Takeaways
       • Robustness is all about comparisons. We need methods to quickly determine differences in behavior.
       • If the system reaches the end of the field, we are happy. If it blows up, or if the queries behave worse along the way, it is not robust.
  • 22. Conclusions
       • Tractorpulling is an effective new toolkit for robustness testing a DBMS in various dimensions
       • Refinements for ease of analysis are needed (GUIs)
       • http://sourceforge.net/projects/tractorpulling
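
The growth functions on slides 14 to 16 can be tabulated with a few lines of Python. This is only a sketch under stated assumptions, not the benchmark's own code: `db(i)` is read as the amount of data added at track i (which matches "db(i) = 0 (no growth)" for Meadow), `|Q(i)|` as one initial query plus all queries added up to track i, and the `Scenario` class, its field names, and the sample values `d = 0.5`, `C = 8`, and `g = 2.0` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One tractor-pull scenario; d and g follow the slide symbols."""
    name: str
    d: float        # fraction of RAM taken by the initial database, 0 < d < 1
    g: float        # growth rate (0 means no growth)
    per_track: str  # "const": k new queries per track, "linear": i * k
    k: int          # 4 for Hill and Rockies, C for Meadow


def db0(s, ram_bytes, dom):
    # db(0) = (d x RAM) x 1 / (2 x dom)
    return s.d * ram_bytes / (2 * dom)


def db_increment(s, i, ram_bytes, dom):
    # db(i) = g x i x db(0), read as data ADDED at track i (assumption)
    return s.g * i * db0(s, ram_bytes, dom)


def new_queries(s, i):
    # qry(i): k per track (Hill, Meadow) or i x k (Rockies)
    return s.k if s.per_track == "const" else i * s.k


def total_queries(s, n):
    # |Q(n)| = 1 + sum of qry(i) for i = 1..n
    return 1 + sum(new_queries(s, i) for i in range(1, n + 1))


# Sample parameter values are illustrative only.
HILL = Scenario("hill", d=0.5, g=0.2, per_track="const", k=4)
MEADOW = Scenario("meadow", d=0.5, g=0.0, per_track="const", k=8)
ROCKIES = Scenario("rockies", d=0.5, g=2.0, per_track="linear", k=4)

if __name__ == "__main__":
    for s in (HILL, MEADOW, ROCKIES):
        print(s.name, [total_queries(s, n) for n in range(4)])
```

Summing `qry(i)` reproduces the closed forms on the slides: 1 + 4 × i for Hill, 1 + C × i for Meadow, and 1 + 4 × i × (i + 1) / 2 for Rockies.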
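
To make the query template on slide 11 concrete, the sketch below expands it for relations R0..Ri with a linear join pattern. Several details are assumptions, because the slide leaves them open: relation Rk is taken to have columns B0..Bk (suggested by the avg list), the linear join is written as a chain on column B0, and `selectpattern` is replaced by a caller-supplied placeholder predicate. The function name `linear_query` is hypothetical.

```python
def linear_query(i, select_pred="R0.B0 >= 0"):
    """Expand the slide-11 template for R0..Ri with a linear (chain) join.

    Assumptions: Rk has columns B0..Bk; neighbours join on B0;
    select_pred is a stand-in for the unspecified selectpattern.
    """
    rels = [f"R{k}" for k in range(i + 1)]
    keys = [f"R{k}.B{k}" for k in range(i + 1)]  # R0.B0, ..., Ri.Bi
    avgs = [f"avg(R{k}.B{j})" for k in range(i + 1) for j in range(k + 1)]
    joins = [f"R{k}.B0 = R{k + 1}.B0" for k in range(i)]  # hypothetical chain
    where = " AND ".join([select_pred] + joins)
    return (
        f"SELECT {', '.join(keys + ['count(*)'] + avgs)}\n"
        f"FROM {', '.join(rels)}\n"
        f"WHERE {where}\n"
        f"GROUP BY {', '.join(keys)}\n"
        f"ORDER BY {', '.join(keys)}"
    )


if __name__ == "__main__":
    print(linear_query(2))
```

A cyclic pattern would add one more predicate closing the chain (Ri back to R0), and a star pattern would join every Rk to R0; both are left out to keep the sketch short.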
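
The robustness metric on slide 17 is built from standard deviations, so a rough version can be computed from per-track measurements. The sketch below is one possible reading, not the toolkit's implementation: it assumes the population standard deviation, interprets QO and QE as the deviation over all tracks and QOk and QEk as the deviation within each track k, and leaves out the holistic component H, which the slides do not define. The function name and argument names are hypothetical.

```python
from statistics import pstdev


def robustness(load_s, storage_bytes, opt_s_per_track, exec_s_per_track):
    """Return a dict of standard deviations, keyed by the slide symbols.

    load_s, storage_bytes: one measurement per track.
    opt_s_per_track, exec_s_per_track: one list of timings per track.
    """
    all_opt = [t for track in opt_s_per_track for t in track]
    all_exec = [t for track in exec_s_per_track for t in track]
    return {
        "L": pstdev(load_s),
        "S": pstdev(storage_bytes),
        "QO": pstdev(all_opt),                               # over all tracks
        "QOk": [pstdev(track) for track in opt_s_per_track],  # within track k
        "QE": pstdev(all_exec),
        "QEk": [pstdev(track) for track in exec_s_per_track],
    }


if __name__ == "__main__":
    # Made-up timings, for illustration only.
    r = robustness(
        load_s=[10.0, 10.0, 10.0],
        storage_bytes=[100, 200, 300],
        opt_s_per_track=[[1.0, 1.0], [1.0, 3.0]],
        exec_s_per_track=[[2.0, 4.0, 4.0, 4.0], [5.0, 5.0, 7.0, 9.0]],
    )
    print(r)
```

A value near zero means the system behaved uniformly along the field; comparing these numbers across systems is what the takeaways slide refers to.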
