• Like
  • Save
Database slide
Upcoming SlideShare
Loading in...5
×
Uploaded on

Database slide

Database slide

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
79
On Slideshare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
0
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. CS5226 Lecture 1 Introduction
  • 2. Database Tuning ◮ Make a database application run more quickly ◮ higher throughput ◮ lower response time ◮ Auto-tuning / self-tuning ◮ Better performance ◮ Easier manageability ◮ Query workload ◮ Online transaction processing (OLTP) ◮ Decision support systems (DSS) / Online analytical processing (OLAP) / Data warehousingCS5226: Sem 2, 2012/13 Introduction 2
  • 3. Anatomy of DBMS (Hellerstein, Stonebraker, Hamilton, 2007)CS5226: Sem 2, 2012/13 Introduction 3
  • 4. Query Optimization πX , Y , Z project X,Y,Z sort-merge join ⊲⊳R .B=T .B select R.X, S.Y, T.Z R.B = T.B from R, S, T hash join scan table where R.A = S.A ⊲⊳R .A=S.A T R.A = S.A for T and R.B = T.B R S scan index scan table on (A,Y) for R for S Internal Query Physical Query Representation Query PlanCS5226: Sem 2, 2012/13 Introduction 4
  • 5. Performance Tuning Knobs ◮ Schema tuning ◮ Query tuning ◮ Index & materialized view selection ◮ Statistics tuning ◮ Concurrency control tuning ◮ Data partitioning ◮ Memory tuning ◮ Hardware tuningCS5226: Sem 2, 2012/13 Introduction 5
  • 6. Schema Tuning CourseInfo Module Prof Room Building Time CS101 Turing LT 1 CS 0800 CS400 Turing LT 1 CS 1400 MU300 Bach LT 2 Math 1400 MA200 Newton LT 2 Math 1000 CS101 Turing LT 2 Math 1200 Schedule Course Room Time Module Facility Module Prof LT 1 0800 CS101 Room Building CS101 Turing LT 1 1400 CS400 LT 1 CS CS400 Turing LT 2 1400 MU200 LT 2 Math MU300 Bach LT 2 1000 MA200 MA200 Newton LT 2 1200 CS101CS5226: Sem 2, 2012/13 Introduction 6
  • 7. Query Tuning Q1: select c.cname from Customer c where 1000 < (select sum(o.totalprice) from Order o where o.cust# = c.cust#) Q2: select c.cname from Customer c join Order o on c.cust# = o.cust# group by c.cust#, c.cname having 1000 < sum(o.totalprice)CS5226: Sem 2, 2012/13 Introduction 7
  • 8. Query Tuning (cont.) Q3: select c.cust#, c.cname, sum(o.totalprice) as T from Customer c join Order o on c.cust# = o.cust# group by c.cust#, cname Q4: select c.cust#, cname, T from customer c, (select cust#, sum(totalprice) as T from Order group by cust#) as o where c.cust# = o.cust#CS5226: Sem 2, 2012/13 Introduction 8
  • 9. Query Tuning (cont.) Q5: select distinct R.A, S.X from R, S where R.B = S.Y Q6: select R.A, S.X from R, S where R.B = S.YCS5226: Sem 2, 2012/13 Introduction 9
  • 10. Index Tuning select A, B, C from R where 10 < A < 20 Access methods for selection queries: ◮ Table scan ◮ Use one or more indexesCS5226: Sem 2, 2012/13 Introduction 10
  • 11. B+-tree index 18 12 17 22 24 (Moe,10,55,180) (Larry,12,70,175) (Alice,17,48,175) (John,18,59,182) (Marcie,22,50,165) (Sally,24,48,169) (Curly,10,65,171) (Bob,15,60,178) (Lucy,17,45,170) (Charlie,20,69,173) (Linus,23,60,166) (Tom,25,56,176) Clustered index on (age) Relation R 59 name age weight height Moe 10 55 180 Curly 10 65 171 50 65 Larry 12 70 175 Bob 15 60 178 Alice 17 48 175 Lucy 17 45 170 (45, RID32) (50, RID51) (59, RID41) (65, RID12) John 18 59 182 (48, RID31) (55, RID11) (60, RID22) (69, RID42) Charlie 20 69 173 (48, RID61) (56, RID62) (60, RID52) (70, RID21) Marcie 22 50 165 Linus 23 60 166 Sally Tom 24 25 48 56 169 176 Unclustered index on (weight)CS5226: Sem 2, 2012/13 Introduction 11
  • 12. Index access methods ◮ Index scan ◮ Index seek [+ RID lookup ] ◮ Index intersection [+ RID lookup ]CS5226: Sem 2, 2012/13 Introduction 12
  • 13. Index scan select height from Student (175,48) (170,45) (178,60) (165,50, RID51) (170,45, RID32) (175,48, RID31) (178,60, RID22) (166,60, RID52) (171,65, RID12) (175,70, RID21) (180,55, RID11) (169,48, RID61) (173,69, RID42) (176,56, RID62) (182,59, RID41) Index on (height,weight)CS5226: Sem 2, 2012/13 Introduction 13
  • 14. Index seek select weight from Student where weight between 55 and 65 59 50 65 (45, RID32) (50, RID51) (59, RID41) (65, RID12) (48, RID31) (55, RID11) (60, RID22) (69, RID42) (48, RID61) (56, RID62) (60, RID52) (70, RID21) Index on (weight)CS5226: Sem 2, 2012/13 Introduction 14
  • 15. Index seek + RID lookups select weight select name from Student from Student where weight between 55 and 59 where weight between 55 and 59 and age ≥ 20 59 50 65 Index on (weight) (45, RID32) (50, RID51) (59, RID41) (65, RID12) (48, RID31) (55, RID11) (60, RID22) (69, RID42) (48, RID61) (56, RID62) (60, RID52) (70, RID21) (Moe,10,55,180) (Larry,12,70,175) (Alice,17,48,175) (Curly,10,65,171) (Bob,15,60,178) (Lucy,17,45,170) Data (Sally,24,48,169) (Marcie,22,50,165) (John,18,59,182) (Tom,25,56,176) (Linus,23,60,166) (Charlie,20,69,173) PagesCS5226: Sem 2, 2012/13 Introduction 15
  • 16. Index intersection select height, weight from Student where height between 164 and 170 and weight between 50 and 59 (175,48) 170 178 Index on (height) (165, RID51) (170, RID32) (175, RID31) (178, RID22) (166, RID52) (171, RID12) (175, RID21) (180, RID11) (169, RID61) (173, RID42) (176, RID62) (182, RID41) 59 50 65 Index on (weight) (45, RID32) (50, RID51) (59, RID41) (65, RID12) (48, RID31) (55, RID11) (60, RID22) (69, RID42) (48, RID61) (56, RID62) (60, RID52) (70, RID21)CS5226: Sem 2, 2012/13 Introduction 16
  • 17. Index Tuning Q1: select A, B, C from R where 10 < A < 20 and 20 < B < 100 Q2: select B, C, D from R where 50 < B < 100 and 60 < D < 80CS5226: Sem 2, 2012/13 Introduction 17
  • 18. Materialized View Tuning Q1: select R.B from R, S where R.A = S.X and S.Y > 100 MV1: select R.A, R.B, S.X, S.Y from R, S where R.A = S.X Q1’: select B from MV1 where Y > 100CS5226: Sem 2, 2012/13 Introduction 18
  • 19. Tuning of indexes & materialized views Given a query workload and a disk space constraint, what is the optimal configuration of indexes & materialized views to optimize the performance of the workload?CS5226: Sem 2, 2012/13 Introduction 19
  • 20. Tuning of statistics Examples of statistics: ◮ table cardinality ◮ statistics for each column: ◮ number of distinct values ◮ highest & lowest values ◮ frequent values ◮ data distribution statistics ◮ multi-column statistics Issues ◮ What statistics to collect? ◮ When to collect/refresh statistics?CS5226: Sem 2, 2012/13 Introduction 20
  • 21. Tuning of concurrency control ◮ Concurrency control protocols ◮ Two-phase locking ◮ Snapshot isolation ◮ Consistency vs concurrency tradeoff ◮ ANSI SQL isolation levels Dirty Unrepeatable Phantom Isolation Level Read Read Read READ UNCOMMITTED possible possible possible READ COMMITTED not possible possible possible REPEATABLE READ not possible not possible possible SERIALIZABLE not possible not possible not possibleCS5226: Sem 2, 2012/13 Introduction 21
  • 22. Tuning of memory ◮ How to optimize memory allocation? Oracle Memory Model (Dageville & Zait, VLDB 2002)CS5226: Sem 2, 2012/13 Introduction 22
  • 23. Data partitioning ◮ Increase data availability ◮ Decrease administrative cost ◮ Improve query performance Shared-nothing parallel DBMS (Hellerstein, et al., 2007)CS5226: Sem 2, 2012/13 Introduction 23
  • 24. References Additional Readings: ◮ J.M. Hellerstein, M. Stonebraker, J. Hamilton, Architecture of a Database System, Foundations and Trends in Databases, 1(2), 2007, 141-259.CS5226: Sem 2, 2012/13 Introduction 24