Top 20 Design Tips for MySQL Data Architects
Upcoming SlideShare
Loading in...5
×
 

Top 20 Design Tips for MySQL Data Architects

on

  • 61,993 views

There are many poorly designed database application in use today. There are several reasons including lack of expertise, or lack of experience in MySQL specifically. ...

There are many poorly designed database application in use today. There are several reasons including lack of expertise, or lack of experience in MySQL specifically.

This presentation will aim to educate an attendee for the top 20 things every database design should have. As a consultant I see these common mistakes regularly, some of which can easily be corrected without any code changes.

Some topics include:

Why is normalization important? When do you optimize your normalized schema? Why NOT NULL is important? Why VARCHAR is bad? Disk = Memory = Performance? MySQL has 9 different Numeric data types, Oracle has 1. Why? Think transactions, think strict data integrity. Leverage the power of covering indexes. Query Cache, friend or enemy.

Statistics

Views

Total Views
61,993
Views on SlideShare
60,191
Embed Views
1,802

Actions

Likes
112
Downloads
2,146
Comments
5

52 Embeds 1,802

http://ronaldbradford.com 559
http://blog.macronimous.com 351
http://learnmysql.blogspot.com 295
http://90.0.0.225 219
http://www.slideshare.net 123
http://www.dbthink.com 60
http://next.ronaldbradford.com 33
http://www.scoop.it 17
http://priyeshpandya.blogspot.in 15
http://learnmysql.blogspot.in 14
https://blackboard.strayer.edu 10
http://feeds.feedburner.com 9
http://tutorailspoint.blogspot.com 9
http://priyeshpandya.blogspot.com 9
http://freelanceunited.in.th 6
http://www.linkedin.com 6
http://learnmysql.blogspot.ca 5
https://twitter.com 5
http://www.freelanceunited.in.th 5
http://www.techiegyan.com 5
http://sathishkrcs.byethost2.com 4
http://translate.googleusercontent.com 3
http://learnmysql.blogspot.fi 2
http://learnmysql.blogspot.com.au 2
http://learnmysql.blogspot.pt 2
http://localhost 2
http://www.plurk.com 2
http://sathishkrcs.hostbyet2.com 2
http://learnmysql.blogspot.co.il 2
http://learnmysql.blogspot.gr 2
http://static.slideshare.net 2
https://roman-beta.rebelmouse.com 2
http://blog.thinkdiff.net 1
http://learnvbdotnet.blogspot.in 1
http://learnmysql.blogspot.no 1
http://webcache.googleusercontent.com 1
http://xianguo.com 1
http://learnmysql.blogspot.com.es 1
http://bucket.tumblr.com 1
http://www.lmodules.com 1
http://static.slidesharecdn.com 1
http://www.biblios.ro 1
http://90.0.0.150:5051 1
http://designcodemobilesecurity.blogspot.com 1
http://www.e-presentations.us 1
http://hml-monqi.monqi.com.br 1
http://learnmysql.blogspot.co.uk 1
http://learnmysql.blogspot.com.ar 1
http://learnmysql.blogspot.fr 1
http://learnmysql.blogspot.mx 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • @yeshiming auto increment int is usually used as unique identifier. When you have 2^64 rows in your db, you surely need to separate it, or the performance sucks
    Are you sure you want to
    Your message goes here
    Processing…
  • Why is bigint not needed for auto increment?
    Are you sure you want to
    Your message goes here
    Processing…
  • Aye, good stuff. A lot of this actually applies to MSSQL design as well, and I'm sure many other database servers.
    Are you sure you want to
    Your message goes here
    Processing…
  • good!
    greate slide share in http://www.jianbaor.com
    Are you sure you want to
    Your message goes here
    Processing…
  • Very good!
    Even if I already use a lot of the tips, i learned some usefull ones.
    Thanx for the nice share :)
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Top 20 Design Tips for MySQL Data Architects Top 20 Design Tips for MySQL Data Architects Presentation Transcript

  • The Top 20 Design Tips The Top 20 Design Tips For MySQL Enterprise Data Architects Ronald Bradford Principal 42SQL April 2008 Presented By: Ronald Bradford Version 1.1 10.Apr.2008 www.ronaldbradford.com
  • The Top 20 Design Tips 1. Know Your Technology Tools ❖ Generics are inefficient ❖ Product expertise in a different RDBMS is not enough ❖ You have chosen MySQL ❖ Maximize it's strengths ❖ Minimize it's weaknesses Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips Overview ❖ Table Structure ❖ SQL ❖ Indexes ❖ Enterprise Approaches Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 1. Know Your Technology Tools ❖ Maximize MySQL strengths ❖ Scale out / HA Options ❖ Different Storage Engines ❖ Query Cache ❖ Minimize MySQL weaknesses ❖ No Online Alter ❖ Backup Strategies ❖ Instrumentation Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 2. Know Your Disk Footprint Disk = Memory = Performance ❖ Every single byte counts ❖ Average 25% - 30% saving on engagements ❖ Better 60% (200GB System)‫‏‬ ❖ Best 78% (8GB per master with 12 masters)‫‏‬ Less disk accesses and more data in memory Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 3. Choose Your Numeric Data Type ❖ MySQL has 9 numeric data types ❖ Oracle for example has only 1 Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 3. Choose Your Numeric Data Type ❖ Integer: TINYINT, SMALLINT, MEDIUMINT, INT, BIGINT ❖ Floating Point: FLOAT, DOUBLE ❖ Fixed Point: DECIMAL ❖ Other: BIT, (ENUM maybe)‫‏‬ Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 3. Choose Your Numeric Data Type ❖ Favorite signs of poor design ❖ INT(1)‫‏‬ ❖ BIGINT AUTO_INCREMENT ❖ no UNSIGNED used ❖ DECIMAL(31,0)‫‏‬ Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 3. Choose Your Numeric Data Type ❖ INT(1) - 1 does not mean 1 digit ❖ (1) represents client output display format only ❖ INT is 4 Bytes, TINYINT is 1 Byte ❖ TINYINT UNSIGNED can store from 0 – 255 ❖ BIT is even better when values are 0 - 1 Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 3. Choose Your Numeric Data Type ❖ BIGINT is not needed for AUTO_INCREMENT ❖ INT UNSIGNED stores 4.3 billion values ❖ You should be partitioning when at billions of rows ❖ BIGINT is applicable for some columns ❖ e.g. summation of values Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 3. Choose Your Numeric Data Type Best Practice ❖ Best Practice ❖ All integer columns UNSIGNED unless there is a reason otherwise ❖ Adds a level of data integrity for negative values Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 4. Other Data Type Efficiencies ❖ TIMESTAMP v DATETIME ❖ Suitable for EPOCH only values ❖ TIMESTAMP is 4 bytes ❖ DATETIME is 8 bytes ❖ FYI: DATE is 3 bytes, TIME is 3 bytes = 6 Bytes??? Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 4. Other Data Type Efficiencies ❖ CHAR(n)‫‏‬ ❖ Use VARCHAR(n) for variable values ❖ e.g. CHAR(128) when storing ~10 bytes Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 5. Application Data Type Efficiencies ❖ Using Codes or ENUM ❖ A description is a presentation layer function ❖ e.g. 'M', 'F' instead of 'Male', 'Female' ❖ e.g. 'A', 'I' instead of 'Active', 'Inactive' ❖ BINARY(16/20)‫ ‏‬v CHAR(32/40) ❖ MD5() or HASH() Hex value with twice the length ❖ INT UNSIGNED for IPv4 address ❖ VARCHAR(15) results in average 12 bytes v 4 bytes Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 6. NOT NULL ❖ Saves up to a byte per column per row of data ❖ Double benefit for indexed columns ❖ Don't use frameworks or tools ❖ NOT NULL DEFAULT '' is bad design Best Always use NOT NULL unless Practice there is a reason why not Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 7. Know about character sets ❖ Default in MySQL 5 is UTF8 ❖ Can be defined at database, schema, table or column level ❖ Only define columns that need UTF8 ❖ e.g. Codes, MD5 Value, web address ❖ MySQL internal buffers are fixed width ❖ e.g. VARCHAR(255) utf8 is 765 bytes to store just 1 byte Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 8. When VARCHAR Is Bad ❖ VARCHAR(255)‫‏‬ ❖ Poor Design - No understanding of underlying data ❖ Old Design - ( 4.x limitation, now 3-4 years old)‫‏‬ ❖ Disk usage may be efficient ❖ MySQL internal memory usage is not Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 8. When VARCHAR is bad CREATE TABLE `XXX` ( `orderHandle` varchar(255) NOT NULL default '', `personName` varchar(255) default NULL, `addressLines` varchar(255) default NULL, `city` varchar(255) default NULL, `state` varchar(255) default NULL, `postalCode` varchar(255) default NULL, `countryCode` varchar(255) default NULL, `phone` varchar(255) default NULL, `email` varchar(255) default NULL, `shipMethod` varchar(255) default NULL, `shipTo` varchar(255) default NULL, `receiveByDate` date default NULL, `currency` varchar(3) default NULL, `price` varchar(255) default NULL, `flags` int(11) default '0', `lastUpdateTime` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `creationTime` timestamp NOT NULL default '0000-00-00 00:00:00', PRIMARY KEY (`orderHandle`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 9. Be Wary of TEXT/BLOB ❖ Using SELECT * ❖ MySQL Internal Temporary table will force Temp Disk Table ❖ Internal storage (e.g. Innodb)‫‏‬ ❖ Stores first 768 bytes, then a separate 16k data page per row per TEXT/BLOB field Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 10. Know Every SQL Statement ❖ Developers don't write proper SQL statements ❖ SQL statements will directly affect your performance ❖ For Example ❖ Repeating SQL statements for no benefit ❖ 1000 very quick small unnecessary queries is worse then 1 slow query Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 10. Know Every SQL Statement ❖ Data collection options ❖ Incomplete Options ❖ Slow Query Log ❖ SHOW PROCESSLIST ❖ Application level logging ❖ Impractical Options ❖ General Log Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 10. Know Every SQL Statement ❖ Data collection options ❖ MySQL Proxy ❖ See histogram.lua ❖ Firewall forwarding rules Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 11. Monitor Every SQL Statement ❖ Review Query Execution Plan (QEP)‫‏‬ ❖ EXPLAIN ❖ Time queries ❖ Row Count / Affected rows ❖ Result Set Size Best Practice Review over time, things change Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 12. The Impact Of Indexes ❖ Good ❖ Dramatic performance improvements ❖ Improves memory usage ❖ Data Integrity ❖ Bad ❖ Slows performance for writes ❖ Wastes disk space for unused, duplicate or ineffective indexes ❖ In-effective usage of memory Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 13. Index Types For Design ❖ Concatenated Indexes ❖ (col1, col2) ❖ Partial Indexes ❖ (name(20)) ❖ Covering Indexes ❖ Full Text Indexes ❖ No function based indexes Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 14. Minimizing internal MySQL processing ❖ Correctly design tables, indexes and SQL to eliminate ❖ Using temporary table ❖ Using filesort Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 15. Transactions ❖ Always design for transactions ❖ Always use transactions ❖ Use a transactional storage engine Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 16. Data Integrity is Key ❖ MySQL historically has been very lax ❖ Warnings (e.g. Truncations) are rarely every caught ❖ SQL_MODE=STRICT_ALL_TABLES ❖ Within Schema ❖ NOT NULL ❖ ENUM ❖ UNSIGNED Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 17. Leverage The Query Cache ❖ Query Cache can be a great benefit ❖ Deterministic v Non Deterministic SQL Best Practice MySQL Query Cache is not the only type of caching you should consider Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 17. Leverage The Query Cache Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 17. Leverage The Query Cache ❖ SHOW PROFILE Path ❖ Simple SELECT ❖ No Query Cache 17 steps ❖ With Query Cache 5 steps ❖ Does not perform parse ❖ Does not perform optimize Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips +--------------------------------+----------+---------------------------+---------------+-------------+ | Status | Duration | Source_function | Source_file | Source_line | +--------------------------------+----------+---------------------------+---------------+-------------+ | (initialization) | 0.000014 | send_result_to_client | sql_cache.cc | 1143 | | checking query cache for query | 0.000042 | open_tables | sql_base.cc | 2652 | | Opening tables | 0.000015 | mysql_lock_tables | lock.cc | 153 | | System lock | 0.000009 | mysql_lock_tables | lock.cc | 163 | | Table lock | 0.000034 | mysql_select | sql_select.cc | 2273 | | init | 0.000041 | optimize | sql_select.cc | 765 | | optimizing | 0.000008 | optimize | sql_select.cc | 924 | | statistics | 0.000016 | optimize | sql_select.cc | 934 | | preparing | 0.000012 | exec | sql_select.cc | 1594 | | executing | 0.000008 | exec | sql_select.cc | 2114 | | Sending data | 0.000163 | mysql_select | sql_select.cc | 2318 | | end | 0.000021 | mysql_execute_command | sql_parse.cc | 5141 | | query end | freeing items | closing tables Text | 0.000007 | query_cache_end_of_result | sql_cache.cc | | storing result in query cache | 0.000007 | mysql_parse | 0.000018 | dispatch_command | 0.000009 | log_slow_statement | sql_parse.cc | | sql_parse.cc | | sql_parse.cc | 735 | 6142 | 2146 | 2204 | | logging slow query | 0.000006 | dispatch_command | sql_parse.cc | 2169 | +--------------------------------+----------+---------------------------+---------------+-------------+ 17 rows in set (0.00 sec) +--------------------------------+----------+-----------------------+--------------+-------------+ | Status | Duration | Source_function | Source_file | Source_line | +--------------------------------+----------+-----------------------+--------------+-------------+ | (initialization) | 0.000012 | send_result_to_client | sql_cache.cc | 1143 | | checking query cache for query | 0.00001 | send_result_to_client | sql_cache.cc | 1224 | | checking privileges on cached | 0.000007 | send_result_to_client | sql_cache.cc | 1317 | | sending cached result to clien | 0.000025 | log_slow_statement | sql_parse.cc | 2204 | | logging slow query | 0.000007 | dispatch_command | sql_parse.cc | 2169 | +--------------------------------+----------+-----------------------+--------------+-------------+ 5 rows in set (0.00 sec) Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 18. Create Objects Appropriately ❖ Using 1 table instead of ʻnʼ for same column structure ❖ e.g. Code table for each type of code ❖ Splitting tables for optimal storage ❖ e.g. Placing optional TEXT/BLOB columns in second table ❖ Use permanent tables instead of TEMPORARY tables Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 19. Naming Standards ❖ Name all Primary Keyʼs Uniquely ❖ e.g. customer_id, order_id not id ❖ Use Data Dictionary SQL to verify data types ❖ Data Types & Lengths ❖ Be Descriptive ❖ e.g. invoice_date not just date ❖ Avoid Reserved Words ❖ e.g. date, time, timestamp Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips 20. Testing, Testing, Testing ❖ You must have a testing environment ❖ Testing on a Production server is not an option Best Practice The goal of a testing environment is not to test your software, it is to break your software. Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips Executive Summary ❖ Learn and know MySQL specifics ❖ Disk = Memory = Performance ❖ If you don't know your SQL you don't know your application. Log, Review & Monitor all SQL ❖ Know all benefits of different indexes ❖ You must test to failure in a dedicated test environment Presented By: Ronald Bradford www.ronaldbradford.com
  • The Top 20 Design Tips Professional Help is Available ❖ PrimeBase Technologies ❖ Technology Experts ❖ Solution Experts ❖ 2 decades Expertise & Experience in Enterprise RDBMS Data Architecture ❖ 9 years in MySQL www.ronaldbradford.com/contact Presented By: Ronald Bradford www.ronaldbradford.com