2. OCP MySQL 5.0 - Oracle Certified Professional
Connect Me :
Facebook: https://www.facebook.com/avishekkumar.sharma.5
LinkedIn: http://in.linkedin.com/pub/avishek-kumar-sharma/31/798/509
Twitter: https://twitter.com/sharma_avishek
Google+: https://plus.google.com/103775319893123886681/posts
Blog:
http://avisheksharma.wordpress.com
Contact Me :
Email: avisheks@mindfiresolutions.com / avishekk111@gmail.com
Skype: mfsi_avisheks
About Me
3. Today's Talk
1. Choose Correct engine
2. Data Types and Schema Design
3. Understanding Query Execution plan
4. Profiling
5. Role of Indexing
6. Optimizing GROUP BY/ORDER BY
7. Covering index
8. Scaling data
9. Conclusion
4. Choose Correct engine
“You should use InnoDB for your tables unless you have a compelling need to use a different
engine” - High Performance MySQL by Peter Zaitsev
MyISAM vs InnoDB
InnoDB:
Transaction support, crash-safe
Row-level locking
Good for a mix of UPDATE and SELECT statements
MyISAM:
Best for read-heavy applications
Does not scale well when there are a lot of writes
Memory:
Best for small temporary summary tables: low latency, non-persistent
InnoDB also supports full-text search as of MySQL 5.6
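A minimal sketch of checking and choosing the engine in practice. The table names orders_demo and legacy_demo are hypothetical and not from the talk; information_schema.tables and the ENGINE table option are standard MySQL.

```sql
-- List the storage engine of every table in the current schema.
SELECT table_name, engine
FROM information_schema.tables
WHERE table_schema = DATABASE();

-- Hypothetical table created explicitly as InnoDB.
CREATE TABLE orders_demo (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED NOT NULL,
  total       DECIMAL(10,2) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Hypothetical legacy table that starts out as MyISAM ...
CREATE TABLE legacy_demo (
  id   INT UNSIGNED NOT NULL,
  note VARCHAR(100),
  PRIMARY KEY (id)
) ENGINE=MyISAM;

-- ... and is converted to InnoDB (this rebuilds the table).
ALTER TABLE legacy_demo ENGINE=InnoDB;
```

On a large table the ALTER can take a while because every row is copied, so it is usually tried on a copy first.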
5. Data Types and Schema Design
1. Add `id` to each table ( id INT(11) UNSIGNED PRIMARY KEY AUTO_INCREMENT)
2. Smaller sizes are usually better
3. Decide deliberately between NULL, blank and 0 (choose NOT NULL for KEY
columns).
4. CHAR vs VARCHAR (space vs performance)
6. Data Types and Schema Design
5. ENUM: use the ENUM data type for string values that come from a
small, fixed set
6. Identifiers used in table joins should have the same data type, which
improves performance by avoiding type conversion.
7. Normalization vs Denormalization (Faster Joins vs Redundancies)
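The schema guidelines above could look like this in a table definition. This is only a sketch: customer_demo, order_demo and all of their columns are hypothetical names chosen for illustration.

```sql
CREATE TABLE customer_demo (
  id           INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- surrogate key
  country_code CHAR(2)      NOT NULL,                 -- fixed length: CHAR
  email        VARCHAR(255) NOT NULL,                 -- variable length: VARCHAR
  status       ENUM('active', 'suspended', 'closed')
               NOT NULL DEFAULT 'active',             -- small, fixed value set
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE order_demo (
  id          INT UNSIGNED      NOT NULL AUTO_INCREMENT,
  customer_id INT UNSIGNED      NOT NULL,  -- same type as customer_demo.id
  quantity    SMALLINT UNSIGNED NOT NULL,  -- smallest type that fits the data
  PRIMARY KEY (id),
  KEY idx_order_customer (customer_id)
) ENGINE=InnoDB;

-- The join compares INT UNSIGNED with INT UNSIGNED, so no conversion is needed.
SELECT c.email, o.quantity
FROM customer_demo c
JOIN order_demo o ON o.customer_id = c.id;
```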
7. See what PROCEDURE ANALYSE() suggests.
MySQLClint(DB:as_01):select id from dochead procedure analyse()\G
*************************** 1. row ***************************
Field_name: as_01.dochead.id
Min_value: 1
Max_value: 11294
Min_length: 1
Max_length: 5
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 5678.8494
Std: 3270.5730
Optimal_fieldtype: SMALLINT(5) UNSIGNED NOT NULL
1 row in set (0.01 sec)
Optimal_fieldtype is the data type
MySQL suggests for the column
8. Query Execution Plan
The EXPLAIN keyword gives you insight into how
MySQL executes your query.
Eg: EXPLAIN SELECT col1 FROM <table_name> WHERE col2=123 GROUP BY col3 ORDER BY col3;
9. Explain some field values
Some of the important columns to consider are:
● type: const, eq_ref, ref, fulltext, range, index, ALL (ordered from fastest to slowest)
● possible_keys: all the candidate indexes for the query
● key: the index actually selected (could be more than one)
● key_len: length in bytes of the index part used (summed across columns for composite indexes)
● rows: approximate number of rows MySQL expects to examine
● Extra: Using filesort, Using index, Using temporary, Using where
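To see how these columns change between a good and a bad access path, two plans can be compared side by side. The sketch below assumes the sakila sample database is installed; in the standard sakila schema actor_id is the primary key of actor and first_name has no index, but verify this on your own copy. The \G terminator is specific to the mysql command-line client.

```sql
-- Lookup by primary key: compare the type, key and rows columns ...
EXPLAIN SELECT actor_id, last_name
FROM sakila.actor
WHERE actor_id = 4\G

-- ... with a filter on a column that (by assumption) has no index.
EXPLAIN SELECT actor_id
FROM sakila.actor
WHERE first_name = 'PENELOPE'\G
```

Reading the two outputs together makes it easier to tell which of type, key and rows is signalling the problem.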
10. SET profiling=ON
Profiling helps us analyse the resources a query consumes
● mysql> SET profiling=1;
● mysql> run your query....
● mysql> SHOW PROFILES;
● mysql> SHOW PROFILE [CPU] FOR QUERY 1;
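The per-step output of SHOW PROFILE can get long, so it sometimes helps to aggregate it by state. A sketch, assuming the server still provides the INFORMATION_SCHEMA.PROFILING table and that the profiled statement received Query_ID 1; the COUNT(*) query on sakila.actor is only a placeholder workload.

```sql
SET profiling = 1;

-- Placeholder workload; replace with the query under investigation.
SELECT COUNT(*) FROM sakila.actor;

-- Look up the Query_ID of the statement you care about.
SHOW PROFILES;

-- Assumption: the statement above was assigned Query_ID 1.
SET @query_id = 1;

-- Total time per execution state, slowest first.
SELECT STATE,
       SUM(DURATION) AS total_seconds,
       COUNT(*)      AS calls
FROM INFORMATION_SCHEMA.PROFILING
WHERE QUERY_ID = @query_id
GROUP BY STATE
ORDER BY total_seconds DESC;

SET profiling = 0;
```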
11. Role of Indexing
Index optimization is perhaps the most powerful way to
improve query performance.
When performance problems occur:
– Add indexes
– Rewrite your queries
– Or both
Do you need to fetch the row data (often from disk)?
– If the index contains the data, you don't
– If you don't, your query is covered by an index (= index-only
query)
12. Isolating the Column
MySQL generally can’t use indexes on columns unless the
columns are isolated in the query. “Isolating” the column means
it should not be part of an expression or be inside a function in
the query.
Wrong:
– mysql> SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5;
Correct:
– mysql> SELECT actor_id FROM sakila.actor WHERE actor_id = 4;
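The same rule applies when the column is hidden inside a function call, which is a common pattern with dates. A sketch, assuming the sakila sample database and that rental_date is the leading column of an index on sakila.rental (true for the standard sakila schema, but check your copy):

```sql
-- Not isolated: rental_date is wrapped in TO_DAYS(), so an index on it
-- cannot be used to narrow down the rows.
SELECT rental_id
FROM sakila.rental
WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(rental_date) <= 10;

-- Isolated: the column stands alone on one side of the comparison and
-- all of the arithmetic is moved to the other side.
SELECT rental_id
FROM sakila.rental
WHERE rental_date >= DATE_SUB(CURRENT_DATE, INTERVAL 10 DAY);
```

Both statements select rentals from the last 10 days; running EXPLAIN on each shows whether the rewrite changed the access type.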
13. Keep queries deterministic (Query Cache)
Because the result of a non-deterministic function can change between
calls, MySQL does not cache queries that use one.
// query cache does NOT work
$r = mysql_query("SELECT username FROM user WHERE signup_date >= CURDATE()");
// query cache works!
$today = date("Y-m-d");
$r = mysql_query("SELECT username FROM user WHERE signup_date >= '$today'");
NB: This applies to all non-deterministic functions, such as NOW() and RAND().
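Whether a query was actually served from the cache can be checked from the status counters. A sketch that only applies to servers that still have the query cache (MySQL 5.x; it was removed in MySQL 8.0); the date literal is a made-up value.

```sql
-- Is the query cache enabled at all?
SHOW VARIABLES LIKE 'query_cache_type';

-- Note the current hit counter.
SHOW STATUS LIKE 'Qcache_hits';

-- Run the cacheable form of the query twice (hypothetical date literal).
SELECT username FROM user WHERE signup_date >= '2014-01-01';
SELECT username FROM user WHERE signup_date >= '2014-01-01';

-- If the second run came from the cache, Qcache_hits has increased.
SHOW STATUS LIKE 'Qcache_hits';

-- Queries that could not be cached are counted here.
SHOW STATUS LIKE 'Qcache_not_cached';
```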
14. Column Prefix Indexes
● Smaller is usually better: a shorter index takes less space and is faster to scan
● Allows indexing BLOB/TEXT columns
CREATE TABLE user_address(
`id` INT UNSIGNED,
...
...
`address` varchar(255),
PRIMARY KEY(`id`)
);
Eg: CREATE INDEX part_of_address ON user_address(address(10));
15. Choose the correct prefix length
● Check the full selectivity of your column.
● Check selectivity of several prefix lengths.
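The two checks above can be sketched as follows. Selectivity here means the number of distinct values divided by the number of rows; the prefix lengths 5, 7 and 10 are arbitrary candidates, and user_address is the table from the previous slide.

```sql
-- Selectivity of the full column (1.0 would mean every value is unique).
SELECT COUNT(DISTINCT address) / COUNT(*) AS full_selectivity
FROM user_address;

-- Selectivity of several candidate prefix lengths, computed in one pass.
SELECT COUNT(DISTINCT LEFT(address, 5))  / COUNT(*) AS sel_5,
       COUNT(DISTINCT LEFT(address, 7))  / COUNT(*) AS sel_7,
       COUNT(DISTINCT LEFT(address, 10)) / COUNT(*) AS sel_10
FROM user_address;
```

The shortest prefix whose value is close to full_selectivity is the natural candidate.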
16. Here we go
● Choose the prefix length whose selectivity is almost the same as
the full-column selectivity. In the example, the selectivity of
prefix length 7 is almost the same as the full-column
selectivity, so we will use it for the index.
● Create prefix indexes.
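With a prefix length of 7 chosen, creating and verifying the index might look like this. The index name idx_address_prefix is hypothetical.

```sql
-- Index only the first 7 characters of address.
ALTER TABLE user_address
  ADD INDEX idx_address_prefix (address(7));

-- The Sub_part column of the output shows the indexed prefix length.
SHOW INDEX FROM user_address;
```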
17. Optimize GROUP BY/ORDER BY
● GROUP BY and ORDER BY queries do post-retrieval
work
● GROUP BY can force MySQL to create a temporary table (“Using temporary”)
● ORDER BY can force a filesort (“Using filesort”)
● Indexing can help get rid of this work.
18. Ignore GROUP BY sorting
● GROUP BY sorts implicitly; ORDER BY NULL suppresses that sort
MySQLClint(DB:as_01)>
EXPLAIN
SELECT DISTINCT c.id,legal_name
FROM contractor c
JOIN dochead dh ON c.id=dh.contractor_to
WHERE
dh.system_id=623
AND dh.doc_type='invoice'
GROUP BY c.id
ORDER BY NULL\G
19. Check EXPLAIN's Extra
*************************** 1. row ***************************
id: 1
type: ref
....
possible_keys: dochead_contractor_to,dochead_system_id
key: dochead_system_id
key_len: 4
ref: const
rows: 306
Extra: Using where; Using temporary; Using filesort
20. Check EXPLAIN's Changes
*************************** 1. row ***************************
id: 1
type: ref
....
....
possible_keys: dochead_contractor_to,dochead_system_id
key: dochead_system_id
key_len: 4
ref: const
rows: 306
Extra: Using where; Using temporary; Using filesort
“Using filesort” vanishes from Extra when
we do not need ordering (ORDER BY NULL)
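A reduced before/after pair makes the difference easier to spot in Extra. This sketch reuses the dochead table and its system_id and contractor_to columns from the query above, and it assumes a server where GROUP BY still sorts implicitly (MySQL 5.x; the implicit sort was removed in MySQL 8.0).

```sql
-- Before: GROUP BY also sorts the groups, which can show up in Extra
-- as "Using filesort".
EXPLAIN
SELECT dh.contractor_to, COUNT(*) AS docs
FROM dochead dh
WHERE dh.system_id = 623
GROUP BY dh.contractor_to\G

-- After: ORDER BY NULL tells MySQL that the group order does not matter.
EXPLAIN
SELECT dh.contractor_to, COUNT(*) AS docs
FROM dochead dh
WHERE dh.system_id = 623
GROUP BY dh.contractor_to
ORDER BY NULL\G
```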
21. Sometimes derived tables help
● The idea: fewer rows have to be traversed
● Filter results by the cheapest condition first
● Limitation: the base table's indexes cannot be used on the derived
table, so it has to stay small.
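As a sketch of the idea, the contractor query from the earlier slides could be rewritten so that dochead is filtered down first and only the surviving rows are joined. The table and column names come from that query; whether this is actually faster depends on the data, so compare both forms with EXPLAIN.

```sql
SELECT c.id, c.legal_name
FROM (
    -- Cheapest filter first: reduce dochead to the distinct contractors
    -- of one system before joining.
    SELECT DISTINCT contractor_to
    FROM dochead
    WHERE system_id = 623
) AS d
JOIN contractor c ON c.id = d.contractor_to;
```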
22. Covering Index
● Indexes need to be designed for the whole query, not
just the WHERE clause. A covering index contains (or “covers”)
all the data needed to satisfy a query.
Query with traditional index:
– Get right rows with index
– Get data from rows
– Send data back to client
Index-covered query:
– Get right rows with index
– Send data back to client (the data comes straight from the index, so the row lookup is skipped)
23. Case Study
SELECT sum(value)
FROM Table1
WHERE item_id=? AND category_id = ?
GROUP BY customer_id;
ALTER TABLE Table1
ADD INDEX t1_index(item_id,category_id,customer_id,value);
24. Check if used covering index
mysql> EXPLAIN SELECT sum(value)
FROM Table1
WHERE item_id=? AND category_id = ?
GROUP BY customer_id;
*************************** 1. row ***************************
table: Table1
.....
possible_keys: t1_index
key: t1_index
Extra: Using where; Using index
“Using index” indicates that the query used a
covering index.
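To try this end to end, the case study can be reproduced on a scratch table. Table1_demo and its column types are assumptions made for this sketch, as are the literal values 10 and 3 and the extra note column; only the index column list follows the case study.

```sql
CREATE TABLE Table1_demo (
  id          INT UNSIGNED  NOT NULL AUTO_INCREMENT,
  item_id     INT UNSIGNED  NOT NULL,
  category_id INT UNSIGNED  NOT NULL,
  customer_id INT UNSIGNED  NOT NULL,
  value       DECIMAL(10,2) NOT NULL,
  note        VARCHAR(255),
  PRIMARY KEY (id),
  KEY t1_index (item_id, category_id, customer_id, value)
) ENGINE=InnoDB;

-- Every referenced column is in t1_index: look for "Using index" in Extra.
EXPLAIN
SELECT SUM(value)
FROM Table1_demo
WHERE item_id = 10 AND category_id = 3
GROUP BY customer_id\G

-- note is not in the index, so this query can no longer be covered.
EXPLAIN
SELECT SUM(value), MAX(note)
FROM Table1_demo
WHERE item_id = 10 AND category_id = 3
GROUP BY customer_id\G
```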
25. Column Order in Covering Index
1. Const or equality
WHERE a = 5
2. Range
WHERE a = 5 AND b > 5
3. Order or Group By
WHERE a = 5 AND b > 5 GROUP BY c
WHERE a = 5 AND b > 5 ORDER BY c DESC
4. Select
count(d), sum(d)
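Applied to the placeholder columns a, b, c and d used above, this ordering gives an index on (a, b, c, d). A sketch with a hypothetical table t_demo:

```sql
CREATE TABLE t_demo (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  d INT NOT NULL,
  -- equality column, then range column, then grouping column, then selected column
  KEY idx_abcd (a, b, c, d)
) ENGINE=InnoDB;

-- Every column the query touches is in idx_abcd, so it can be covered.
-- Because b is a range condition, the grouping on c may still need
-- extra work; check Extra to see what the optimizer decided.
EXPLAIN
SELECT c, COUNT(d), SUM(d)
FROM t_demo
WHERE a = 5 AND b > 5
GROUP BY c\G
```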
26. Let's put it all together (Scaling)
● Analyse EXPLAIN output and keep trying and improving
– STRAIGHT_JOIN,
– FORCE INDEX/USE INDEX,
– IGNORE INDEX
● Memory/Disk/CPU/Network
● Upgrade MySQL Server
● Move to Amazon RDS
– I strongly recommend it, from personal experience
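The optimizer hints listed above are written directly into the query. A sketch using the contractor and dochead tables and the index names that appear in the EXPLAIN output earlier in the talk; which hint helps, if any, has to be confirmed with EXPLAIN on real data.

```sql
-- FORCE INDEX: make the optimizer use a specific index on dochead.
SELECT c.id, c.legal_name
FROM contractor c
JOIN dochead dh FORCE INDEX (dochead_system_id)
  ON c.id = dh.contractor_to
WHERE dh.system_id = 623;

-- IGNORE INDEX: rule out an index the optimizer keeps choosing.
SELECT c.id, c.legal_name
FROM contractor c
JOIN dochead dh IGNORE INDEX (dochead_contractor_to)
  ON c.id = dh.contractor_to
WHERE dh.system_id = 623;

-- STRAIGHT_JOIN: read the tables in the order they are written
-- (dochead first, then contractor).
SELECT STRAIGHT_JOIN c.id, c.legal_name
FROM dochead dh
JOIN contractor c ON c.id = dh.contractor_to
WHERE dh.system_id = 623;
```

Hints pin a decision that the optimizer would otherwise revisit as the data changes, so they are best treated as a last resort and re-checked after upgrades.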