Divide and Be Conquered?

    Divide and Be Conquered? - Presentation Transcript

    1. Divide And Be Conquered? 04/23/2009 Brooks Johnson [email_address]
    2. Performance not guaranteed
      • Does not automatically improve performance
      • Often degrades performance
      • In theory could improve concurrency
      • First cover InnoDB
      • Then MyISAM
      • Finish up with admin improvements
    3. No Partition - Classic Primary Key
         CREATE TABLE `test`.`SaleO` (
           `orderId` int(11) NOT NULL,
           `customerId` int(11) NOT NULL,
           `productId` int(11) NOT NULL,
           `productBigId` int(11) NOT NULL,
           `unit` int(11) NOT NULL,
           `purchaseAmount` decimal(16,2) NOT NULL,
           `purchaseCost` decimal(16,2) NOT NULL,
           `purchaseDate` datetime NOT NULL,
           PRIMARY KEY (`orderId`),
           KEY `idx_sale_purchasedate` (`purchaseDate`),
           KEY `idx_sale_product` (`productId`),
           KEY `idx_sale_customer` (`customerId`)
         ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    4. Partitioning by Date (Month)
         CREATE TABLE `test`.`SaleP` (
           `orderId` int(11) NOT NULL,
           `customerId` int(11) NOT NULL,
           `productId` int(11) NOT NULL,
           `productBigId` int(11) NOT NULL,
           `unit` int(11) NOT NULL,
           `purchaseAmount` decimal(16,2) NOT NULL,
           `purchaseCost` decimal(16,2) NOT NULL,
           `purchaseDate` datetime NOT NULL,
           PRIMARY KEY (`purchaseDate`,`orderId`),
           KEY `idx_sale_product` (`productId`),
           KEY `idx_sale_order` (`orderId`),
           KEY `idx_SaleP_orderId` (`orderId`)
         ) ENGINE=InnoDB DEFAULT CHARSET=utf8
         PARTITION BY RANGE (to_days(purchaseDate)) (
           PARTITION p0 VALUES LESS THAN (730882) ENGINE = InnoDB,
           PARTITION p1 VALUES LESS THAN (730910) ENGINE = InnoDB,
           PARTITION p2 VALUES LESS THAN (730941) ENGINE = InnoDB,
           PARTITION p3 VALUES LESS THAN (730971) ENGINE = InnoDB,
           PARTITION p4 VALUES LESS THAN (731002) ENGINE = InnoDB,
           PARTITION p5 VALUES LESS THAN (731032) ENGINE = InnoDB,
           PARTITION p6 VALUES LESS THAN (731063) ENGINE = InnoDB,
           PARTITION p7 VALUES LESS THAN (731094) ENGINE = InnoDB,
           PARTITION p8 VALUES LESS THAN (731124) ENGINE = InnoDB,
           PARTITION p9 VALUES LESS THAN (731155) ENGINE = InnoDB,
           PARTITION p10 VALUES LESS THAN (731185) ENGINE = InnoDB,
           PARTITION p11 VALUES LESS THAN (731216) ENGINE = InnoDB,
           PARTITION p12 VALUES LESS THAN MAXVALUE ENGINE = InnoDB
         )
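The numeric boundaries in the date-partitioned table are TO_DAYS values. A minimal Python sketch of how they map to calendar dates follows. It assumes that MySQL's TO_DAYS equals Python's proleptic Gregorian ordinal plus 365; the helper names (`to_days`, `from_days`, `month_starts`) are illustrative and not part of the presentation.

```python
from datetime import date

# Assumption: MySQL TO_DAYS(d) == Python date.toordinal() + 365.
MYSQL_TO_DAYS_OFFSET = 365

# Boundaries copied from the PARTITION BY RANGE clause on the slide (p0..p11).
SLIDE_BOUNDARIES = [730882, 730910, 730941, 730971, 731002, 731032,
                    731063, 731094, 731124, 731155, 731185, 731216]


def to_days(d):
    """Day number of a date under the offset assumption above."""
    return d.toordinal() + MYSQL_TO_DAYS_OFFSET


def from_days(n):
    """Inverse of to_days: turn a day number back into a date."""
    return date.fromordinal(n - MYSQL_TO_DAYS_OFFSET)


def month_starts(first, count):
    """Return `count` consecutive first-of-month dates starting at `first`."""
    out = []
    year, month = first.year, first.month
    for _ in range(count):
        out.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out


if __name__ == "__main__":
    # Print the upper bound of each range partition as a calendar date.
    for i, n in enumerate(SLIDE_BOUNDARIES):
        print(f"p{i}: purchaseDate < {from_days(n)}")
```

Under that assumption each boundary falls on the first day of a month, from 2001-02-01 (p0) through 2002-01-01 (p11), so p11 holds December 2001 and p12 catches everything later.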
    5. Partition by Order
         CREATE TABLE `test`.`SaleOPO` (
           `orderId` int(11) NOT NULL,
           `customerId` int(11) NOT NULL,
           `productId` int(11) NOT NULL,
           `productBigId` int(11) NOT NULL,
           `unit` int(11) NOT NULL,
           `purchaseAmount` decimal(16,2) NOT NULL,
           `purchaseCost` decimal(16,2) NOT NULL,
           `purchaseDate` datetime NOT NULL,
           PRIMARY KEY (`orderId`),
           KEY `idx_sale_purchasedate` (`purchaseDate`),
           KEY `idx_sale_product` (`productId`),
           KEY `idx_sale_customer` (`customerId`)
         ) ENGINE=InnoDB DEFAULT CHARSET=utf8
         PARTITION BY HASH (orderID) PARTITIONS 12
    6. Sum Month without Partitioning
         explain partitions select sum(purchaseCost) from SaleO
         where purchaseDate >= '2001-12-01' and purchaseDate < '2002-01-01'\G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleO
            partitions: NULL
                  type: range
         possible_keys: idx_sale_purchasedate
                   key: idx_sale_purchasedate
               key_len: 8
                   ref: NULL
                  rows: 18219104
                 Extra: Using where
         1 row in set (0.00 sec)
      • Takes 34 seconds to execute
    7. Sum Month with Date Partitions
         explain partitions select sum(purchaseCost) from SaleP
         where purchaseDate >= '2001-12-01' and purchaseDate < '2002-01-01' \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleP
            partitions: p11,p12
                  type: range
         possible_keys: PRIMARY
                   key: PRIMARY
               key_len: 8
                   ref: NULL
                  rows: 4238758
                 Extra: Using where
         1 row in set (0.00 sec)
      • Takes 20 seconds to run (34 non-partitioned)
      • Performance improvement due to clustering by date, not partitioning
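The EXPLAIN output for the date-partitioned table lists p11 and p12, although December 2001 falls entirely inside p11. One plausible reading, which is an assumption and not something the slides state, is that the optimizer prunes on a closed interval of TO_DAYS values, so the upper bound 731216 itself lands in p12. The sketch below models only that reading; `partition_for` and `partitions_for_closed_range` are hypothetical helpers, not MySQL functions.

```python
from bisect import bisect_right

# Upper bounds (exclusive) of p0..p11, copied from the SaleP definition.
BOUNDS = [730882, 730910, 730941, 730971, 731002, 731032,
          731063, 731094, 731124, 731155, 731185, 731216]

# p12 is the MAXVALUE partition, so there is one more name than bounds.
NAMES = [f"p{i}" for i in range(len(BOUNDS) + 1)]


def partition_for(days):
    """Partition holding a given TO_DAYS value (VALUES LESS THAN semantics)."""
    return NAMES[bisect_right(BOUNDS, days)]


def partitions_for_closed_range(lo_days, hi_days):
    """Partitions touched by the closed interval [lo_days, hi_days]."""
    first = bisect_right(BOUNDS, lo_days)
    last = bisect_right(BOUNDS, hi_days)
    return NAMES[first:last + 1]


if __name__ == "__main__":
    # 731185 and 731216 are the day numbers behind the p10 and p11 bounds.
    print(partitions_for_closed_range(731185, 731216))
```

With the two day numbers from the query's date range, the model returns p11 and p12, matching the partitions column in the EXPLAIN output.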
    8. Sum Month with Order Partitions
         mysql> explain partitions select sum(purchaseCost) from SaleOPO
             -> where purchaseDate >= '2001-12-01'
             -> and purchaseDate < '2002-01-01' \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleOPO
            partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11
                  type: range
         possible_keys: idx_sale_purchasedate
                   key: idx_sale_purchasedate
               key_len: 8
                   ref: NULL
                  rows: 13863552
                 Extra: Using where
         1 row in set (0.00 sec)
      • Takes 39 seconds to run
      • A bit longer than the non-partitioned table (34 seconds)
      • Partitioning might have caused a small overhead in this case
    9. Select orders 10,000 times
         explain partitions select unit from SaleO where orderId = 1 \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleO
            partitions: NULL
                  type: const
         possible_keys: PRIMARY
                   key: PRIMARY
               key_len: 4
                   ref: const
                  rows: 1
                 Extra:
         1 row in set (0.01 sec)
      • 55 seconds for the non-partitioned table
      • One process randomly selecting 10,000 orders
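The slides do not show the benchmark driver. A minimal sketch of how a single process might time 10,000 random primary-key lookups is given below. Everything in it is an assumption: `run_query` stands in for whatever database client call is used, and `max_order_id` and `seed` are hypothetical parameters.

```python
import random
import time


def time_random_lookups(run_query, table, max_order_id, n=10_000, seed=0):
    """Run n single-row lookups on random orderIds; return elapsed seconds.

    run_query(sql, params) is a caller-supplied function (hypothetical)
    that executes one parameterised statement against the database.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    sql = f"select unit from {table} where orderId = %s"
    start = time.perf_counter()
    for _ in range(n):
        order_id = rng.randint(1, max_order_id)
        run_query(sql, (order_id,))
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stub that executes nothing, to show the call shape only.
    elapsed = time_random_lookups(lambda sql, params: None, "SaleO", 1_000_000)
    print(f"{elapsed:.3f} s")
```

Passing the table name lets the same loop be pointed at SaleO, SaleP, and SaleOPO, so the three timings come from identical lookup sequences.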
    10. Select orders 10,000 times
         explain partitions select unit from SaleP where orderId = 1 \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleP
            partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12
                  type: ref
         possible_keys: idx_sale_order,idx_SaleP_orderId
                   key: idx_sale_order
               key_len: 4
                   ref: const
                  rows: 13
                 Extra:
         1 row in set (0.17 sec)
      • 139 seconds for the date-partitioned table
      • Over twice as long as the non-partitioned table (55 seconds)
    11. Non-Partitioned Index
    12. Partitioned Index
    13. Select orders 10,000 times
         explain partitions select unit from SaleOPO where orderId = 1 \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleOPO
            partitions: p1
                  type: const
         possible_keys: PRIMARY
                   key: PRIMARY
               key_len: 4
                   ref: const
                  rows: 1
                 Extra:
         1 row in set (0.02 sec)
      • 57 seconds, essentially the same as the non-partitioned table (2 seconds slower than its 55 seconds)
      • Partitioning might result in small overhead for each query
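The lookup on the order-partitioned table touches only p1 because HASH partitioning on an integer column places each row by the remainder of the column value divided by the partition count. A small sketch of that rule follows; `hash_partition` is an illustrative helper, and it only models non-negative integer keys.

```python
def hash_partition(order_id, partitions=12):
    """Partition name for a non-negative integer key under
    PARTITION BY HASH (orderId) PARTITIONS 12, i.e. MOD(orderId, 12)."""
    return f"p{order_id % partitions}"


if __name__ == "__main__":
    for order_id in (1, 12, 13, 25):
        print(order_id, hash_partition(order_id))
```

This is why a primary-key lookup prunes to a single partition here, while a date-range query on the same table has to visit all twelve.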
    14. Insert 800,000 rows into table
      • 8 processes each inserting 100,000 rows in parallel
      • Non-partitioned – 173 seconds
      • Partitioned by date – 300 seconds
      • All the data was added to the last date partition
      • Partitioned by order – 237 seconds
      • Data was added to all 12 partitions
      • Partitioning “seemed” to add overhead
    15. MyISAM without Partitioning
         CREATE TABLE `test`.`SaleI` (
           `orderId` int(11) NOT NULL AUTO_INCREMENT,
           `customerId` int(11) NOT NULL,
           `productId` int(11) NOT NULL,
           `productBigId` int(11) NOT NULL,
           `unit` int(11) NOT NULL,
           `purchaseAmount` decimal(16,2) NOT NULL,
           `purchaseCost` decimal(16,2) NOT NULL,
           `purchaseDate` datetime NOT NULL,
           PRIMARY KEY (`orderId`),
           KEY `idx_sale_product` (`productId`),
           KEY `idx_sale_customer` (`customerId`),
           KEY `idx_SaleI_purchaseDate` (`purchaseDate`)
         ) ENGINE=MyISAM AUTO_INCREMENT=121900002 DEFAULT CHARSET=utf8;
    16. MyISAM Partitioning by Date
         CREATE TABLE `test`.`SaleIP` (
           `orderId` int(11) NOT NULL AUTO_INCREMENT,
           `customerId` int(11) NOT NULL,
           `productId` int(11) NOT NULL,
           `productBigId` int(11) NOT NULL,
           `unit` int(11) NOT NULL,
           `purchaseAmount` decimal(16,2) NOT NULL,
           `purchaseCost` decimal(16,2) NOT NULL,
           `purchaseDate` datetime NOT NULL,
           PRIMARY KEY (`purchaseDate`,`orderId`),
           KEY `idx_sale_order` (`orderId`),
           KEY `idx_sale_product` (`productId`),
           KEY `idx_saleIP_customer` (`customerId`)
         ) ENGINE=MyISAM AUTO_INCREMENT=122200002 DEFAULT CHARSET=utf8
         PARTITION BY RANGE (to_days(purchaseDate)) (
           PARTITION p0 VALUES LESS THAN (730882) ENGINE = MyISAM,
           PARTITION p1 VALUES LESS THAN (730910) ENGINE = MyISAM,
           PARTITION p2 VALUES LESS THAN (730941) ENGINE = MyISAM,
           PARTITION p3 VALUES LESS THAN (730971) ENGINE = MyISAM,
           PARTITION p4 VALUES LESS THAN (731002) ENGINE = MyISAM,
           PARTITION p5 VALUES LESS THAN (731032) ENGINE = MyISAM,
           PARTITION p6 VALUES LESS THAN (731063) ENGINE = MyISAM,
           PARTITION p7 VALUES LESS THAN (731094) ENGINE = MyISAM,
           PARTITION p8 VALUES LESS THAN (731124) ENGINE = MyISAM,
           PARTITION p9 VALUES LESS THAN (731155) ENGINE = MyISAM,
           PARTITION p10 VALUES LESS THAN (731185) ENGINE = MyISAM,
           PARTITION p11 VALUES LESS THAN (731216) ENGINE = MyISAM,
           PARTITION p12 VALUES LESS THAN MAXVALUE ENGINE = MyISAM
         )
    17. MyISAM Order Partitioned
         CREATE TABLE `test`.`SaleIPO` (
           `orderId` int(11) NOT NULL AUTO_INCREMENT,
           `customerId` int(11) NOT NULL,
           `productId` int(11) NOT NULL,
           `productBigId` int(11) NOT NULL,
           `unit` int(11) NOT NULL,
           `purchaseAmount` decimal(16,2) NOT NULL,
           `purchaseCost` decimal(16,2) NOT NULL,
           `purchaseDate` datetime NOT NULL,
           PRIMARY KEY (`orderId`),
           KEY `idx_sale_purchaseDate` (`purchaseDate`),
           KEY `idx_sale_product` (`productId`),
           KEY `idx_sale_customer` (`customerId`)
         ) ENGINE=MyISAM AUTO_INCREMENT=120900003 DEFAULT CHARSET=utf8
         PARTITION BY HASH (orderID) PARTITIONS 12
    18. Sum Month without Partitioning
         explain partitions select sum(purchaseCost) from SaleI
         where purchaseDate >= '2001-12-01' and purchaseDate < '2002-01-01' \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleI
            partitions: NULL
                  type: range
         possible_keys: idx_SaleI_purchaseDate
                   key: idx_SaleI_purchaseDate
               key_len: 8
                   ref: NULL
                  rows: 16447915
                 Extra: Using where
         1 row in set (0.00 sec)
      • Takes 14 seconds to execute
    19. Sum Month with Date Partitioned
         explain partitions select sum(purchaseCost) from SaleIP
         where purchaseDate >= '2001-12-01' and purchaseDate < '2002-01-01' \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleIP
            partitions: p11,p12
                  type: ALL
         possible_keys: PRIMARY
                   key: NULL
               key_len: NULL
                   ref: NULL
                  rows: 121300000
                 Extra: Using where
         1 row in set (0.00 sec)
      • 5 seconds (14 seconds non-partitioned)
      • Real improvement
    20. Sum Month with Order Partition
         explain partitions select sum(purchaseCost) from SaleIPO
         where purchaseDate >= '2001-12-01' and purchaseDate < '2002-01-01' \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleIPO
            partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11
                  type: range
         possible_keys: idx_sale_purchaseDate
                   key: idx_sale_purchaseDate
               key_len: 8
                   ref: NULL
                  rows: 11695184
                 Extra: Using where
         1 row in set (0.00 sec)
      • 14 seconds
      • No difference from non-partitioned
    21. Select orders 10,000 times
         explain partitions select unit from SaleI where orderId = 1 \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleI
            partitions: NULL
                  type: const
         possible_keys: PRIMARY
                   key: PRIMARY
               key_len: 4
                   ref: const
                  rows: 1
                 Extra:
         1 row in set (0.02 sec)
      • 96 seconds for non-partitioned
      • One process selecting 10,000 random orders
    22. Select orders 10,000 times
         explain partitions select unit from SaleIP where orderId = 1 \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleIP
            partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12
                  type: ref
         possible_keys: idx_sale_order
                   key: idx_sale_order
               key_len: 4
                   ref: const
                  rows: 13
                 Extra:
         1 row in set (0.00 sec)
      • 111 seconds for date partitioned (96 non-partitioned)
      • A bit worse than non-partitioned, but not that bad
    23. Select orders 10,000 times
         explain partitions select unit from SaleIPO where orderId = 1 \G
         *************************** 1. row ***************************
                    id: 1
           select_type: SIMPLE
                 table: SaleIPO
            partitions: p1
                  type: const
         possible_keys: PRIMARY
                   key: PRIMARY
               key_len: 4
                   ref: const
                  rows: 1
                 Extra:
         1 row in set (0.00 sec)
      • 102 seconds for order partitioned
      • Worse than non-partitioned (96 seconds)
      • Better than date partitioned (111 seconds)
      • Partitioning added overhead again
    24. Insert 800,000 rows into MyISAM
      • 8 processes each inserting 100,000 rows in parallel
      • Non-partitioned table takes 702 seconds
      • Date partitioned table takes 47 seconds
      • All the data was added to the last date partition
      • Order partitioned table takes 974 seconds
      • The data was added to all 12 orderId partitions
      • Date partitioned table was the fastest by far
      • So, real concurrency improvement in this case
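To compare the InnoDB and MyISAM results side by side, the timings quoted on the slides can be expressed relative to the non-partitioned table for each engine and workload. The sketch below only restates the reported numbers; the dictionary layout, the workload labels, and the `relative_to_baseline` helper are illustrative choices.

```python
# Seconds reported on the slides: non-partitioned ("none"),
# partitioned by date ("date"), partitioned by orderId hash ("order").
TIMINGS = {
    ("InnoDB", "sum one month"): {"none": 34, "date": 20, "order": 39},
    ("InnoDB", "10,000 lookups"): {"none": 55, "date": 139, "order": 57},
    ("InnoDB", "800,000 inserts"): {"none": 173, "date": 300, "order": 237},
    ("MyISAM", "sum one month"): {"none": 14, "date": 5, "order": 14},
    ("MyISAM", "10,000 lookups"): {"none": 96, "date": 111, "order": 102},
    ("MyISAM", "800,000 inserts"): {"none": 702, "date": 47, "order": 974},
}


def relative_to_baseline(timings):
    """Divide each timing by the non-partitioned timing of the same row.

    Values below 1.0 mean the partitioned table was faster.
    """
    return {
        key: {scheme: round(sec / row["none"], 2) for scheme, sec in row.items()}
        for key, row in timings.items()
    }


if __name__ == "__main__":
    for (engine, workload), ratios in relative_to_baseline(TIMINGS).items():
        print(f"{engine:7} {workload:16} date={ratios['date']:<5} order={ratios['order']}")
```

Seen this way, the MyISAM date-partitioned insert stands out at roughly 0.07 of the baseline time (about 15 times faster), while the InnoDB date-partitioned lookups take about 2.5 times the baseline.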
    25. Partitioning Admin Improvements
      • ANALYZE PARTITION, CHECK PARTITION, OPTIMIZE PARTITION, REBUILD PARTITION, and REPAIR PARTITION
      • No longer use “OPTIMIZE TABLE SaleP”
      • Instead use “ALTER TABLE SaleP OPTIMIZE PARTITION p12”
      • Optimizing just the most recent partition can be over an order of magnitude faster than a full table optimization
      • One partition is far easier to fit into memory and much less data to sort
      • Dropping a partition is much faster than deleting rows
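The contrast between per-partition maintenance and whole-table operations can be made concrete by writing out the statements involved. The sketch below only builds SQL strings; the helper names are hypothetical, and the cutoff date 2001-02-01 assumes that the p0 boundary 730882 corresponds to that date under TO_DAYS.

```python
def optimize_partition_sql(table, partition):
    """Per-partition replacement for OPTIMIZE TABLE."""
    return f"ALTER TABLE {table} OPTIMIZE PARTITION {partition}"


def drop_partition_sql(table, partition):
    """Discard a whole partition and every row stored in it."""
    return f"ALTER TABLE {table} DROP PARTITION {partition}"


def delete_rows_sql(table, column, before):
    """Row-by-row equivalent of dropping the oldest range partition."""
    return f"DELETE FROM {table} WHERE {column} < '{before}'"


if __name__ == "__main__":
    print(optimize_partition_sql("SaleP", "p12"))
    print(drop_partition_sql("SaleP", "p0"))
    # Assumed equivalent: p0 holds rows with purchaseDate before 2001-02-01.
    print(delete_rows_sql("SaleP", "purchaseDate", "2001-02-01"))
```

The last two statements remove the same rows under that assumption, but the DELETE has to find and remove each row and its index entries, whereas DROP PARTITION discards the partition as a unit.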
    26. Partitioning or Disk Striping Partitioning on Different Disks
    27. Partitioning
      • Not a turbo button
      • Can improve performance
      • Can degrade performance
      • Will improve administrative tasks
      • Performance depends on what is partitioned
      • Performance also depends on data distribution
      • Still more to learn
      • Any questions?

    + brooksaixbrooksaix, 6 months ago


    MySQL Conference Presentation on the benefits and d more
