This document discusses 7 common database mistakes and how to avoid them. It begins by emphasizing the importance of proper backups and being able to restore data. It stresses having documentation and training others on restoration processes. The document also recommends keeping software updated for security reasons. It advises monitoring databases to understand performance and ensure uptime. Other mistakes covered include having inconsistent user permissions, not understanding indexing best practices, and not optimizing queries. The document concludes by promoting the benefits of using JSON columns in databases.
2. Hello!
I am Dave Stokes
MySQL Community Manager
Slides for this talk online at
slideshare.net/davidmstokes
3. 1. Backing up data != Safety
I do not care about your cron job, cute scripts, or any of that if what
you back up cannot be used!
4. Backups are vital
Use mysqldump/mysqlpump, XtraBackup, MySQL Enterprise Backup, etc.
Async replication: back up a downstream server
LVM Snapshots
Something else in house
DO ALL THE ABOVE AND LOOK FOR MORE!!!
5. But restoration is more vital
Be able to restore:
• Entire server
• Schema
• Table
• Row
And make sure you are not the only one at your company with this knowledge
• Document
• Train
• Practice!!!!
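One way to practice a table-level restore is to load the backup into a scratch schema and compare it with the source. A minimal sketch, assuming a source schema named world with a city table and a scratch schema named world_restore_test (both names are placeholders, not from the talk):

```sql
-- Assumption: the backup has already been restored into world_restore_test.
-- Compare row counts between the live table and the restored copy.
SELECT
  (SELECT COUNT(*) FROM world.city)              AS live_rows,
  (SELECT COUNT(*) FROM world_restore_test.city) AS restored_rows;

-- Compare table checksums; matching values suggest identical contents.
CHECKSUM TABLE world.city, world_restore_test.city;
```

The numbers only line up if the source has not changed since the backup was taken, and checksums can differ across server versions or row formats, so treat a mismatch as a prompt to investigate rather than proof of a bad backup.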
7. And make sure you remember to back up
• Binary logs
• Stored Procedures
• Views
• Admin databases
• Encryption keys
• Configuration files (my.cnf, firewall, etc.)
And store them someplace safe
(minimum two places)
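Several items on that list can be inventoried from inside the server before you decide how to back them up. A rough sketch using standard statements; the list of system schemas to exclude is an assumption you may need to adjust:

```sql
-- Binary logs currently on the server (errors if binary logging is disabled)
SHOW BINARY LOGS;

-- Stored procedures and functions outside the system schemas
SELECT routine_schema, routine_name, routine_type
FROM information_schema.routines
WHERE routine_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema');

-- Views outside the system schemas
SELECT table_schema, table_name
FROM information_schema.views
WHERE table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema');
```

Encryption keys and configuration files live outside the server, so they still need a separate file-level copy.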
8. What is your company’s record retention policy????!
• Suppose your corporate lawyers ask (or are asked by a court) to show
what data you had on hand five years ago? (copy of data, database,
database software, hardware/software to support)
• Are you subject to the new European rules where you may have to ‘forget’
data of individuals? What if the USA gets similar rules?
• Medical/Legal/Research rules?
9. 2. Keep on top of software updates
Do not skip updates
10. Keep your software updated!
• New features
• Bug Fixes
• Security vulnerabilities
Good for your resume!
(ask your local FoxPro or dBase II professional)
11. And keep a copy of the server backup from before the upgrade around (and a copy from right after the upgrade)
13. Dave’s First Law of Databases
“Nobody ever bitches that the database is running too fast!”
14. What is your database doing RIGHT NOW?!?!?!?
1. RDBMS
a. Queries
b. Logins/connections
c. Index management
d. Analytics
e. Overhead
f. Logging
g. Replication
2. Hardware
A. RAM
B. Buffers
C. Logins
D. Network traffic
E. Logging
F. Paging
G. Disk I/O
H. Overhead
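For a quick manual look at the RDBMS side of that list, a few standard statements go a long way. This is only a sketch of a spot check, not a substitute for continuous monitoring:

```sql
-- Who is connected and what each session is running
SHOW FULL PROCESSLIST;

-- Connection and thread counts
SHOW GLOBAL STATUS LIKE 'Threads_%';

-- InnoDB buffer pool counters, including reads that had to go to disk
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Slow query log settings
SHOW GLOBAL VARIABLES LIKE 'slow_query_log%';
```

The hardware side (RAM, paging, disk I/O, network) has to come from the operating system or a monitoring tool.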
15. Big concept
Does your paycheck and/or the paycheck of your
boss depend on a reliable database? Then make
sure you know what the #%@$ it is doing!!
16. Options
Enterprise Level
MySQL Enterprise Manager, Percona Manager, SolarWinds, etc.
- Yes, they cost money
Instance Level
MySQL Workbench
phpMyAdmin (Say hi to the Script Kiddies)
If you do decide to ‘roll your own’ you are not spending your time wisely. Your
time has value. If you do roll your own, do you do your own dentistry?
19. MySQL Authentication is Promiscuous
It is VERY easy to have the same username with different hosts (they are a pair) AND
different permissions for each of those pairs!
20. Connection flow
• $ mysql -u joe -p -> sent to server
• Server checks to see if host is okay to connect
• Server may prompt for authentication string
• User and auth string checked
• Is user over resource limits?
• Connection established -> prompt set to user
21. How you can get different privileges for ‘the same’ account
• Joe @ 172.12.10.x: Read, write, update, delete
  (Joe’s developer account when he joined the company)
• Joe @ 10.10.x.x: Read, write, update, delete, drop
  (Joe’s account for use in the data center)
• Joe @ %: EVERYTHING!!
  (Joe’s account created at 3AM fighting problems with a legacy application)
The MySQL daemon looks for the most specific host match first, so a connection
from any host outside the first two ranges falls through to Joe @ % and gets
its privileges!!!!
23. Big concept
So someone guesses Joe’s password is ‘JoeISNumber1’ from outside
the 10.10 and 172.12 networks and ends up with privs for
EVERYTHING!!
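To find out whether you have this problem, list the host entries for a user name and compare their grants. A minimal sketch, assuming the accounts for joe from the slides actually exist on your server:

```sql
-- List every host entry that exists for the user name joe
SELECT user, host FROM mysql.user WHERE user = 'joe' ORDER BY host;

-- Show what one user/host pair is allowed to do (repeat for each pair)
SHOW GRANTS FOR 'joe'@'%';

-- From a logged-in session: USER() is what the client asked for,
-- CURRENT_USER() is the account row the server actually matched
SELECT USER(), CURRENT_USER();
```

If the wildcard entry turns out to be unnecessary, removing or narrowing it closes the hole described above.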
25. Big concept
The cost based optimizer determines the cheapest way to return the requested data,
sort of like a GPS, based on statistics about the data and the available indexes.
And like a GPS it can be fooled, and it may not have the latest information.
26. Full Table Scan
A Full Table Scan is when every row of a table (file) needs to be read to
see if it matches the search criteria.
SELECT name FROM city WHERE District='Texas';
There is no INDEX on the District column
27. Index Scan
An index lets you go directly to the matching record(s).
SELECT name FROM city WHERE CountryCode='USA';
There is an INDEX on the CountryCode column so it does not
need to read all the rows!
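EXPLAIN shows which of the two access paths the optimizer picks. The sketch below uses a hypothetical scratch table, city_demo, shaped like the city table on the slides (the table name, index name, and column sizes are assumptions):

```sql
-- Hypothetical scratch table: indexed on CountryCode, not on District
CREATE TABLE city_demo (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  name        CHAR(35) NOT NULL,
  CountryCode CHAR(3)  NOT NULL,
  District    CHAR(20) NOT NULL,
  INDEX idx_country (CountryCode)
);

-- No index on District: expect type = ALL (full table scan) in the plan
EXPLAIN SELECT name FROM city_demo WHERE District = 'Texas';

-- Index on CountryCode: expect idx_country in the key column
EXPLAIN SELECT name FROM city_demo WHERE CountryCode = 'USA';
```

In the EXPLAIN output, the type and key columns are the ones to read first.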
29. Why you do not want to index every column!!
Indexes should be considered a parallel table to your data table.
So each time you add, delete, or modify an indexed column, that parallel table
has to be updated TOO!
That overhead is expensive!!!
And the more options you have for indexes, the more the optimizer has to
consider -- That adds a factorial to the complexity for each possible index
30. Indexing options
1. Compound indexes use more than one column
   a. A Year-Month-Date index works for Y-M-D, Y-M, and Y
   b. But not for D or M-D (leftmost only!)
2. Index a prefix of the column for wide columns
3. For ‘multiple short’ columns, consider hashes
   a. SELECT * FROM tbl_name WHERE hash_col=MD5(CONCAT(val1,val2))
      AND col1=val1 AND col2=val2;
4. Optimizer Hints
   a. Comments in your query to direct optimizer
   b. SELECT /*+ JOIN_ORDER(t1, t2) JOIN_PREFIX(t2, t1) */ ... FROM t1, t2;
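The leftmost rule for compound indexes and the prefix option for wide columns can be tried on a scratch table. A minimal sketch; the table events_demo, its columns, and the 20-character prefix length are all made up for illustration:

```sql
-- Hypothetical table with a Year-Month-Day compound index
CREATE TABLE events_demo (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  yr   SMALLINT NOT NULL,
  mo   TINYINT  NOT NULL,
  dy   TINYINT  NOT NULL,
  note VARCHAR(200),
  INDEX idx_ymd (yr, mo, dy),
  INDEX idx_note (note(20))   -- index prefix: only the first 20 characters
);

-- Leftmost columns present (Y-M): idx_ymd can be used
EXPLAIN SELECT * FROM events_demo WHERE yr = 2019 AND mo = 6;

-- Leftmost column missing (M-D): idx_ymd cannot be used for the lookup
EXPLAIN SELECT * FROM events_demo WHERE mo = 6 AND dy = 15;
```

Compare the key column of the two plans to see the difference.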
32. Sys Schema -- Performance statistics
You can use MySQL Workbench to access the SYS Schema to see a list of
UNUSED indexes
• Do not look right after a restart, as there will not be enough usage data
collected
• Make sure it is not a rarely used index that is vital at quarter or
year end
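The same list is available without Workbench by querying the sys schema directly. A short sketch, assuming the sys schema is installed (it ships with MySQL 5.7 and later):

```sql
-- How long the server has been up, in seconds; a small value means
-- the usage data below is not yet meaningful
SHOW GLOBAL STATUS LIKE 'Uptime';

-- Indexes with no recorded use since the server started
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes;
```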
33. Histograms - Like Indexes without the overhead (for some data)
A histogram is an accurate representation of the distribution of numerical
data. For databases, a histogram is an approximation of the data
distribution within a specific column.
34. Creating a histogram
mysql> ANALYZE TABLE customer UPDATE HISTOGRAM ON c_mktsegment WITH 1024 BUCKETS;
+---------------+-----------+----------+---------------------------------------------------------+
| Table         | Op        | Msg_type | Msg_text                                                |
+---------------+-----------+----------+---------------------------------------------------------+
| dbt3.customer | histogram | status   | Histogram statistics created for column 'c_mktsegment'. |
+---------------+-----------+----------+---------------------------------------------------------+
Do you know the cardinality of your indexes?
Hit rate?
Active data set size?
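The cardinality question and the histogram itself can both be checked from the server. A sketch against the customer table from the slide, assuming MySQL 8.0, where histograms are exposed through information_schema:

```sql
-- The Cardinality column is the estimated number of distinct values per index
SHOW INDEX FROM customer;

-- Histograms the server has stored for the table
SELECT schema_name, table_name, column_name,
       histogram->>'$."histogram-type"' AS histogram_type
FROM information_schema.column_statistics
WHERE table_name = 'customer';

-- Remove the histogram again if it does not help
ANALYZE TABLE customer DROP HISTOGRAM ON c_mktsegment;
```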
36. Invisible indexes
Invisible indexes make it possible to test the effect of removing an index on
query performance, without making a destructive change that must be
undone should the index turn out to be required. Dropping and re-adding an
index can be expensive for a large table, whereas making it invisible and
visible are fast, in-place operations.
ALTER TABLE t1 ALTER INDEX i_idx INVISIBLE;
ALTER TABLE t1 ALTER INDEX i_idx VISIBLE;
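A test cycle with an invisible index might look like the following sketch. It reuses t1 and i_idx from the slide; the column i in the EXPLAIN is a hypothetical stand-in for whatever column the index covers:

```sql
-- 1. Hide the index from the optimizer
ALTER TABLE t1 ALTER INDEX i_idx INVISIBLE;

-- 2. Confirm the visibility flag (MySQL 8.0)
SELECT index_name, is_visible
FROM information_schema.statistics
WHERE table_schema = DATABASE() AND table_name = 't1';

-- 3. Re-check the plans of the queries you care about
EXPLAIN SELECT * FROM t1 WHERE i = 1;

-- 4. Bring the index back if the plans got worse
ALTER TABLE t1 ALTER INDEX i_idx VISIBLE;
```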
38. Query optimization is a skill, not a black art. Even if you use an
ORM (lazy), you need to learn how to use EXPLAIN and read a query plan.
39. Best ways to learn how to optimize queries
1. Buy a copy of High Performance MySQL and/or the MySQL 5.0
   Certification Guide and read the sections on query optimization
   a. Both are dated but valuable
2. Learn to use VISUAL EXPLAIN from MySQL Workbench
3. Read your slow query log DAILY
   a. Not all slow queries are bad (monthly report)
   b. A query that was running well that now shows up on the slow
      query log is BAD
4. Optimize the most frequently run queries FIRST
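If the slow query log is not on yet, it can be enabled at runtime. A minimal sketch; the 2-second threshold is an arbitrary example value, and the query at the end is the District lookup from the full table scan slide:

```sql
-- Turn on the slow query log and log anything slower than 2 seconds
-- (the new threshold applies to connections opened after the change)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;

-- Where the log file is written
SHOW GLOBAL VARIABLES LIKE 'slow_query_log_file';

-- Detailed query plan for a suspect query
EXPLAIN FORMAT=JSON SELECT name FROM city WHERE District = 'Texas';
```

Settings changed with SET GLOBAL are lost at restart unless they are also added to my.cnf.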
41. Vendors have added Native JSON data type
Oracle, SQL Server, MySQL, PostgreSQL* have added JSON data types
• PG actually added two JSON data types
42. JSON Data Types are extremely useful!
Mutability
The JSON data type lets you have a place to store rapidly evolving or
undetermined data.
Replace many-to-many joins
If you have to do repeated index-lookup/data-dives for ‘stub’ data,
consider refactoring that data into a JSON column
43. MySQL Document Store
• New API – The X DevAPI
• New Protocol
• Allows you to use MySQL as a NoSQL JSON Document Store
• No need to set up relational tables, indexes, or normalize data
• Emphasis on CRUD – can use database before having ‘perfect’
knowledge of the data
• Easily mutable
• Works with relational tables too (mix & match)
44. JSON_TABLE
JSON_TABLE is a function that lets you temporarily turn unstructured JSON
data into a relational table for processing with SQL commands
SELECT country_name, IndyYear
FROM countryinfo,
  JSON_TABLE(doc, "$" COLUMNS (
    country_name CHAR(20) PATH "$.Name",
    IndyYear INT PATH "$.IndepYear")) AS stuff
WHERE IndyYear > 1992;
+----------------+----------+
| country_name   | IndyYear |
+----------------+----------+
| Czech Republic |     1993 |
| Eritrea        |     1993 |
| Palau          |     1994 |
| Slovakia       |     1993 |
+----------------+----------+
45. Generated Columns from JSON data
If you find some of that JSON data needs to be
used in SQL searches, then extract that information
into its own column using a Generated Column
CREATE TABLE stuff
(c JSON,
g INT GENERATED ALWAYS AS (c->"$.id"),
INDEX i (g)
);
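Once the generated column is indexed, ordinary SQL searches can use it. A small sketch on a separate scratch table, stuff_demo, with made-up sample rows (the table name and data are assumptions for illustration):

```sql
-- Hypothetical scratch table with the same shape as the slide's example
CREATE TABLE stuff_demo (
  c JSON,
  g INT GENERATED ALWAYS AS (c->"$.id"),
  INDEX i (g)
);

-- Made-up sample documents
INSERT INTO stuff_demo (c) VALUES
  ('{"id": 1, "name": "Fred"}'),
  ('{"id": 2, "name": "Wilma"}');

-- Search on the generated column, return a value from the JSON document
SELECT c->>"$.name" AS name FROM stuff_demo WHERE g = 2;

-- Check that the plan uses index i on the generated column
EXPLAIN SELECT c->>"$.name" AS name FROM stuff_demo WHERE g = 2;
```

The double-quoted JSON paths assume the default SQL mode; with ANSI_QUOTES enabled, use single quotes instead.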
46. Please buy my book
If you are interested in using the JSON
data type with MySQL but find the
documentation hard to understand or
are just looking for a compact reference
guide, then you need my book!