The document discusses various MySQL indexing concepts like primary key indexes, secondary indexes, clustered indexes and hash indexes. It explains how indexes are used based on the left prefix rule and selectivity. It also covers storage engines like InnoDB and MyISAM. The document discusses locking errors like lock wait timeouts and deadlocks. It explains isolation levels like repeatable read, read committed and serializable. It provides details about the Aurora undo log and how it differs from vanilla MySQL. It emphasizes monitoring MySQL using the error log, slow query log and metrics. It also briefly discusses Aurora parallel queries.
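19. Index use
● Left prefix rule
○ Composite index on (Name, ID, country)
■ Select … where ID = 3 and country = 'us' (no index seek: the leading column Name is missing)
■ Select … where ID = 3 (no index seek)
■ Select … where country = 'us' (no index seek)
■ Select … where Name = 'blah' and country = 'us' (seeks on the Name prefix only)
■ Select … where Name = 'blah' and ID = 4 (seeks on the Name, ID prefix)
● Duplicate indexes
○ (Name, country) is redundant: it is a left prefix of (Name, country, zip)
○ (Name, country, zip)
○ (Name, zip) is not a prefix of the others, so it is not redundant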
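
The left prefix rule and the duplicate-index check can be sketched in a few lines of Python. This is a toy model, not how MySQL's optimizer is implemented: it only handles equality predicates, and the function names `usable_prefix` and `redundant_indexes` are made up for illustration.

```python
def usable_prefix(index_cols, equality_cols):
    """Return the leading index columns that equality predicates can seek on."""
    used = []
    for col in index_cols:
        if col not in equality_cols:
            break  # the first missing column ends the usable prefix
        used.append(col)
    return used


def redundant_indexes(indexes):
    """Return every index that is a strict left prefix of another index."""
    return [a for a in indexes
            if any(a != b and b[:len(a)] == a for b in indexes)]


INDEX = ("name", "id", "country")

# The five WHERE clauses from the slide, as sets of columns compared with "=".
for where in [{"id", "country"}, {"id"}, {"country"},
              {"name", "country"}, {"name", "id"}]:
    print(sorted(where), "->", usable_prefix(INDEX, where))

print(redundant_indexes([("name", "country"),
                         ("name", "country", "zip"),
                         ("name", "zip")]))
```

Only the last two WHERE clauses get a non-empty prefix, and only (name, country) is reported as redundant.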
26. Index use
● Selectivity rule
○ Select … where gender = 'male'
○ Say there is a secondary index on gender (cardinality is, say, 2: male/female)
○ Say 70% of rows are male
○ => in the btree, 70% of rows will be under the male node.
○ => since it is a secondary index, the node has pointers to 70% of the rows
○ What does MySQL do?
■ Does it read the index, traverse it and then go to disk for the 70% of data?
■ It does not; it bypasses the index and reads the table directly
■ It does a table scan! (more efficient)
● Explain might indicate use of the index, but in practice it may not be used!
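
A rough way to see why the scan wins is a toy cost comparison. This is only a sketch: the two cost constants are assumptions chosen for illustration, and MySQL's real optimizer uses its own, more detailed cost model.

```python
def choose_access_path(total_rows, matching_rows,
                       random_read_cost=4.0, sequential_read_cost=1.0):
    """Pick the cheaper path under a toy cost model (constants are assumptions)."""
    # Secondary index: one random lookup into the table per matching row.
    index_cost = matching_rows * random_read_cost
    # Table scan: every row is read once, sequentially.
    scan_cost = total_rows * sequential_read_cost
    return "index" if index_cost < scan_cost else "table scan"


# gender = 'male' matches 70% of 1,000,000 rows.
print(choose_access_path(1_000_000, 700_000))
# A highly selective predicate that matches 0.1% of the rows.
print(choose_access_path(1_000_000, 1_000))
```

With these assumed costs the 70% predicate falls back to a table scan, while the 0.1% predicate uses the index.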
32. Lock errors
1. Lock wait timeout exceeded:
a. set global innodb_lock_wait_timeout = x; (default is 50 sec)
b. Show engine innodb status; (when a txn is blocked, you can see what it is blocked on)
c. Show processlist; (list of connections and what they are doing)
2. Deadlocks
a. You do not have to resolve them yourself. MySQL resolves them automatically: InnoDB rolls back the smaller txn (the one that changed fewer rows) and the other one wins. The application should retry the rolled-back txn.
b. Show engine innodb status; shows the latest detected deadlock
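
Since the deadlock victim is rolled back, the caller usually retries it. A minimal sketch of such a retry loop follows, assuming a driver exception that exposes the MySQL error number; the `DatabaseError` class and `run_with_retry` helper are hypothetical stand-ins, not part of any real driver. Error 1213 is MySQL's deadlock error code.

```python
import time

ER_LOCK_DEADLOCK = 1213  # MySQL: "Deadlock found when trying to get lock"


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL errno."""

    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno


def run_with_retry(txn, attempts=3, backoff_seconds=0.0):
    """Run txn(); retry it when it was rolled back as a deadlock victim."""
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except DatabaseError as exc:
            # Anything other than a deadlock, or the final attempt, is re-raised.
            if exc.errno != ER_LOCK_DEADLOCK or attempt == attempts:
                raise
            time.sleep(backoff_seconds * attempt)
```

Here `txn` is expected to be a function that runs the whole transaction from start to commit, so that a retry replays every statement.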
35. Read locks
1. Select for update
a. Use carefully.
i. You might end up locking a whole range of the index. (select .. where cost > 50 for update)
b. Good practice is to select row ids first and then update by id (i.e. specific row locks)
i. Select id where cost > 50
ii. Update where id = x
2. Select for share
a. It takes a shared (read) lock. Other readers can proceed; writers wait until the reading txn completes.
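
The "select ids first, then update by id" pattern could look roughly like this. The sketch uses Python's built-in sqlite3 purely as a stand-in so that it runs anywhere; SQLite has no row locks, so it shows only the shape of the statements, not the locking behaviour. The `items` table and its columns are hypothetical, and a MySQL driver would typically use `%s` placeholders instead of `?`.

```python
import sqlite3


def cap_cost_by_id(conn, threshold, new_cost):
    """Two-step update: collect primary keys with a plain read, then update by id."""
    # Step 1: a plain (non-locking) read that only collects ids.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM items WHERE cost > ? ORDER BY id", (threshold,))]
    # Step 2: one primary-key update per row. The predicate is re-checked
    # because a row may have changed between the two steps.
    for item_id in ids:
        conn.execute(
            "UPDATE items SET cost = ? WHERE id = ? AND cost > ?",
            (new_cost, item_id, threshold))
    conn.commit()
    return ids


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, cost INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, 10), (2, 60), (3, 90)])
conn.commit()
```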
41. Isolation levels
1. Repeatable read (default, I believe, in Aurora)
a. The same plain read, if done again in the txn, sees the same thing (even if some other txn commits before the second read)
2. Read Committed (newly added)
a. Every read in the txn sees the latest committed state
3. Read Uncommitted (not recommended)
a. A read in the txn sees the dirty state of uncommitted txns
4. Serializable
a. Txns behave as if they ran in sequence
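
The difference between repeatable read and read committed can be illustrated with a small in-memory model. This is a toy sketch of snapshot reads, not InnoDB's implementation; the `VersionedStore` and `Txn` classes are invented for the example.

```python
class VersionedStore:
    """Keeps every committed version of a key, tagged with a commit sequence."""

    def __init__(self):
        self.commit_seq = 0
        self.versions = {}  # key -> list of (commit_seq, value), oldest first

    def commit_write(self, key, value):
        self.commit_seq += 1
        self.versions.setdefault(key, []).append((self.commit_seq, value))

    def read_as_of(self, key, seq):
        visible = [value for s, value in self.versions.get(key, []) if s <= seq]
        return visible[-1] if visible else None


class Txn:
    def __init__(self, store, isolation):
        self.store = store
        self.isolation = isolation  # "REPEATABLE READ" or "READ COMMITTED"
        self.snapshot = None

    def read(self, key):
        # READ COMMITTED takes a fresh snapshot for every read;
        # REPEATABLE READ keeps the snapshot taken by its first read.
        if self.isolation == "READ COMMITTED" or self.snapshot is None:
            self.snapshot = self.store.commit_seq
        return self.store.read_as_of(key, self.snapshot)


store = VersionedStore()
store.commit_write("cost", 10)
rr = Txn(store, "REPEATABLE READ")
rc = Txn(store, "READ COMMITTED")
first = (rr.read("cost"), rc.read("cost"))
store.commit_write("cost", 20)  # another txn commits in between
second = (rr.read("cost"), rc.read("cost"))
print(first)   # (10, 10)
print(second)  # (10, 20)
```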
48. Why is this important?
For every transaction:
1. Rows updated:
a. The before version of each row is stored in the undo log. If the txn is rolled back, the before version is restored.
b. Even newly inserted rows get undo log records.
c. Once the txn completes (rolled back/committed) and no other txn still needs them, the relevant undo records are purged.
2. Rows being read:
a. The read works from a snapshot built out of the old row versions kept in the undo log.
b. It helps satisfy the isolation level of the txn.
c. Once the txn completes (rolled back/committed), the old versions it was holding on to can be purged.
55. Aurora Undo Log
● Normally, in vanilla MySQL,
○ each node has its own storage.
● In Aurora, storage is shared across nodes
○ => a single undo log that is shared between the writer & the readers
● Imagine long-running transactions
○ Depending on the isolation level
○ The undo log might keep growing …
○ Purging / garbage collection will not occur …
○ At some point, paralysis due to overdue GC
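
Why one long-running transaction stalls purging can be shown with a toy model. It is a simplification under stated assumptions: each undo record is represented only by the commit sequence of the txn that produced it, each open txn by the sequence number of its snapshot, and `purge_undo` is a made-up helper, not an InnoDB or Aurora API.

```python
def purge_undo(undo_records, active_snapshots):
    """Return (kept, purged_count) for a toy undo log.

    A snapshot taken at sequence v still needs the before-images written by
    every commit after v, so only records at or below the oldest open
    snapshot can be purged.
    """
    oldest = min(active_snapshots) if active_snapshots else float("inf")
    kept = [seq for seq in undo_records if seq > oldest]
    return kept, len(undo_records) - len(kept)


undo_records = list(range(1, 101))  # 100 commits since startup

# A reader that opened its snapshot at sequence 5 is still running.
kept, purged = purge_undo(undo_records, [5, 98])
print(len(kept), purged)

# Once that reader finishes, almost everything becomes purgeable.
kept, purged = purge_undo(undo_records, [98])
print(len(kept), purged)
```

In this model the length of `kept` plays the role of the history list length: it keeps growing for as long as the oldest snapshot stays open, no matter which node opened it.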
56. Writes to Aurora & Cost
● Keep an eye on IOPS
○ IOPS ++ == $ ++
● Batch your writes if possible
● Compress your data before sending.
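
Batching writes can be as simple as grouping rows into multi-row INSERT statements. A minimal sketch, assuming a driver that uses `%s` placeholders (as the common Python MySQL drivers do); the `chunked` and `multi_row_insert` helpers and the `events` table are hypothetical.

```python
def chunked(rows, batch_size):
    """Yield consecutive slices of rows, each at most batch_size long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def multi_row_insert(table, columns, batch):
    """Build one parameterized INSERT statement covering every row in batch."""
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([row_placeholder] * len(batch)))
    params = [value for row in batch for value in row]
    return sql, params


rows = [(1, "a"), (2, "b"), (3, "c")]
for batch in chunked(rows, 2):
    sql, params = multi_row_insert("events", ["id", "name"], batch)
    print(sql, params)  # would be passed to cursor.execute(sql, params)
```

Three rows with a batch size of 2 produce two statements instead of three round trips; the right batch size depends on row width and has to be measured.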
58. Monitoring
1. Never ignore the MySQL error log. It might mention something critical. It's your best friend!
2. Enable the slow query log to keep track of slow-running queries
3. Metrics
a. Recommend Percona PMM (available metrics are graphed nicely)
b. Buffer pool usage metrics
c. Undo log history growth / RollbackSegmentHistoryListLength metric
d. Insert latencies
e. IOPS usage
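
If no metrics dashboard is at hand, the undo history length can also be read from the text of `SHOW ENGINE INNODB STATUS`, which contains a "History list length" line in its TRANSACTIONS section. The sketch below only parses that text; fetching it from the server is left out, and the sample string is a hand-written fragment rather than real server output.

```python
import re


def history_list_length(innodb_status_text):
    """Extract the undo history list length from SHOW ENGINE INNODB STATUS text."""
    match = re.search(r"History list length (\d+)", innodb_status_text)
    return int(match.group(1)) if match else None


# Hand-written fragment for illustration only.
sample = "TRANSACTIONS\n------------\nHistory list length 1375\n"
print(history_list_length(sample))
```

A value that keeps climbing between samples is the symptom described on the undo log slide.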
62. Aurora Parallel query
1. The one feature that other MySQL variants do not have.
2. Allows for parallelism within a select query
3. Bypasses the in-memory buffer pool and does the table scans on disk. :)
a. => IOPS => $ :)
4. Supposedly good for your reporting queries
63. Other helpful stuff
1. Use START TRANSACTION READ ONLY; (less bookkeeping for read-only txns)
2. Run an explain on your query; be aware of whether an index is used.
a. Explains are not always accurate
3. Show processlist; (I used to kill long-running transactions / sleeping transactions - no mercy :) )
4. Show engine innodb status;
5. You have an index on the group by columns, but the order by columns are not in the index? (expect an extra sort)
6. Joining 2 tables - think of 2 for loops (keep outer for loop short)
7. Query Cache - apparently works well in aurora ! (discouraged in rds/mysql)
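
The "2 for loops" picture of a join from point 6 can be written out directly. This is a toy nested-loop join over Python dicts, not MySQL's executor; the table contents and helper names are invented for the example, and the inner table is assumed to have an index on the join column.

```python
from collections import defaultdict


def build_index(rows, key):
    """Group rows by the join column, standing in for an index on the inner table."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def nested_loop_join(outer_rows, inner_index, key):
    """Join via two loops and count how many index lookups the outer loop makes."""
    lookups = 0
    joined = []
    for outer in outer_rows:                           # outer for loop
        lookups += 1                                   # one index lookup per outer row
        for inner in inner_index.get(outer[key], []):  # inner for loop
            joined.append({**outer, **inner})
    return joined, lookups


customers = [{"customer_id": 1, "name": "a"}, {"customer_id": 2, "name": "b"}]
orders = [{"order_id": n, "customer_id": 1 + n % 2} for n in range(5)]

small_outer = nested_loop_join(customers, build_index(orders, "customer_id"), "customer_id")
large_outer = nested_loop_join(orders, build_index(customers, "customer_id"), "customer_id")
print(len(small_outer[0]), small_outer[1])
print(len(large_outer[0]), large_outer[1])
```

Both orders return the same five joined rows, but driving the join from the two-row table needs two lookups instead of five.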
65. Finally
● Make 1 change at a time
○ Change
○ See effect
○ Make next change
● Keep an eye on $ cost
66. Select QNS from you;
select Thank you from me;
Who am I ?
Ex Mysql Guy at Flipkart / Data guy at Trustana
linkedin.com/in/213vishnu/
mash213.wordpress.com/conferences/
https://twitter.com/sweetweet213