This document provides an agenda and summary for a MySQL conference session on clever SQL recipes and techniques for MySQL. The session will cover topics like unexpected sorting results, storing IP addresses efficiently, using auto-increment for multiple columns, generating random values, and transposing row data. The presenter is Damien Séguy, a MySQL expert consultant, who will demonstrate various techniques using a sample PHP statistics database schema. Attendees are encouraged to ask questions throughout the presentation.
2. Agenda
• Clever SQL recipes for MySQL
• Tweaking SQL queries
• You know about MySQL
• Really unexpected results?
3. Agenda
• Solve everyday problems
• Can be solved in more than one way
• Functionality over speed
• Speed over functionality
• May be solved from the programming language
4. Who's talking
• Damien Séguy
• Nexen.net editor
• MySQL Guild member
• Expert consulting with
nexenservices.com
• damien.seguy@nexen.net
• http://www.nexen.net/english.php
5. Scene : PHP statistics
• Applied to PHP Statistics schema
• Distributed system to track PHP
evolution
• Yes, data are real, recent and fun
• Available as download with the slides
• http://www.nexen.net/english.php
6. Don't wait till the end
• Tricks are like good jokes
• Feel free to answer
questions
• Feel free to ask questions
7. Funky sorting
• Given that both query and result are right :
• What sort of sort is that?
mysql> SELECT id, rank FROM mce_1
ORDER BY rank ASC;
+----+--------+
| id | rank   |
+----+--------+
|  1 | first  |
|  2 | second |
|  3 | third  |
|  4 | fourth |
+----+--------+
8. Funky sorting
• Enum is both a string and a number
• Internally used as an integer
• Compact storage, over 65000 values
• Displayed as string
mysql> CREATE TABLE `mce_1` (
`id` tinyint(11) NOT NULL,
`rank` enum('first','second','third','fourth')
) ENGINE=MyISAM CHARSET=latin1;
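A minimal sketch of the two sort orders, assuming a hypothetical copy of the table named `mce_1_demo` (not part of the original schema). Adding 0 exposes the internal index, and casting to CHAR forces an alphabetical sort. The column is backquoted because RANK is a reserved word from MySQL 8.0.2 on.

```sql
-- Hypothetical demo table mirroring mce_1 above.
CREATE TABLE mce_1_demo (
  id TINYINT NOT NULL,
  `rank` ENUM('first','second','third','fourth')
);
INSERT INTO mce_1_demo
VALUES (1,'first'), (2,'second'), (3,'third'), (4,'fourth');

-- ENUM sorts by its internal index (1, 2, 3, 4), i.e. declaration order.
SELECT id, `rank`, `rank` + 0 AS internal_index
FROM mce_1_demo
ORDER BY `rank`;

-- Casting to a string sorts alphabetically: first, fourth, second, third.
SELECT id, `rank`
FROM mce_1_demo
ORDER BY CAST(`rank` AS CHAR);
```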
10. Storing IP addresses
✦ Use INT UNSIGNED to store IP addresses
✦ Storage : 4 bytes / 15 chars (Unicode!)
✦ Half-works with IPv6
✦ Efficient search with logical operators
✦ WHERE ip & INET_ATON('255.0.0.0') =
INET_ATON('212.0.0.0');
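A minimal sketch of the storage and the mask test, assuming a hypothetical table named `ip_demo`. INET_ATON() packs a dotted address into an integer and INET_NTOA() unpacks it. The second query is one possible alternative, not something the slides prescribe: a range test returns the same rows and can use an index on the column.

```sql
-- Hypothetical demo table; ip holds the address as an unsigned integer.
CREATE TABLE ip_demo (ip INT UNSIGNED NOT NULL, KEY (ip));
INSERT INTO ip_demo VALUES
  (INET_ATON('212.0.0.1')),        -- 212 * 2^24 + 1 = 3556769793
  (INET_ATON('212.85.150.133')),
  (INET_ATON('213.0.0.1'));

-- Addresses in 212.0.0.0/8: keep the first byte, compare with the network.
SELECT INET_NTOA(ip) FROM ip_demo
WHERE ip & INET_ATON('255.0.0.0') = INET_ATON('212.0.0.0');

-- Same rows expressed as a range on the integer column.
SELECT INET_NTOA(ip) FROM ip_demo
WHERE ip BETWEEN INET_ATON('212.0.0.0') AND INET_ATON('212.255.255.255');
```

Both queries return the two 212.x.x.x rows and skip 213.0.0.1.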
11. Other manipulations
✦ Works with any number of parts
✦ Don't go over 255 and don't come back
✦ Use it to compare versions
✦ just like version_compare() in PHP
✦ Use it to structure keys
✦ Beware of always-signed platforms
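A small sketch of the version comparison idea. Padding each version to four parts with CONCAT() is an assumption of this example: INET_ATON() gives short forms a special meaning (the manual reads '127.1' as 127.0.0.1), so padding keeps every part in its own byte.

```sql
-- String comparison gets it wrong: '4.4.9' sorts after '4.4.10'.
SELECT '4.4.9' < '4.4.10' AS as_strings;    -- 0

-- Packed into an integer, the versions compare part by part.
SELECT INET_ATON(CONCAT('4.4.9', '.0'))
     < INET_ATON(CONCAT('4.4.10', '.0')) AS as_numbers;    -- 1
```

This only holds while every part stays at or below 255, as the slide warns.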
13. Auto_increment
• Not continuous
• Delete, insert, updates are allowed
• Not starting at 0
• Not incrementing + 1
• auto_increment_increment
• auto_increment_offset
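A minimal sketch of the two variables at work, assuming a hypothetical table named `autoinc_demo`. Depending on the server version, changing them at session level may require extra privileges.

```sql
-- Session-level settings: generated ids follow offset + N * increment.
SET SESSION auto_increment_increment = 10;
SET SESSION auto_increment_offset = 5;

-- Hypothetical demo table.
CREATE TABLE autoinc_demo (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  label CHAR(1)
);
INSERT INTO autoinc_demo (label) VALUES ('a'), ('b'), ('c');

SELECT id, label FROM autoinc_demo;    -- ids 5, 15, 25

-- Back to the defaults.
SET SESSION auto_increment_increment = 1;
SET SESSION auto_increment_offset = 1;
```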
15. Multi auto_increment
mysql> CREATE TABLE `mau` (
`idT` CHAR( 3 ) NOT NULL ,
`idN` INT UNSIGNED NOT NULL
AUTO_INCREMENT,
PRIMARY KEY ( `idT` , `idN` )
) ENGINE=MYISAM;
mysql> INSERT INTO `mau` (idT)
VALUES ('a'), ('a');
mysql> INSERT INTO `mau` (idT)
VALUES('b'), ('c');
mysql> INSERT INTO `mau` (idT)
VALUES ('a'), ('b'), ('c');
mysql> SELECT * FROM mau;
+-----+-----+
| idT | idN |
+-----+-----+
| a | 1 |
| a | 2 |
| b | 1 |
| c | 1 |
| a | 3 |
| b | 2 |
| c | 2 |
+-----+-----+
7 rows in set (0.00 sec)
16. Multi auto_increment
mysql> CREATE TABLE `mau_partition` (
`server` ENUM('a','b','c'),
`idN` INT UNSIGNED NOT NULL
AUTO_INCREMENT,
PRIMARY KEY ( `server` , `idN` )
) ENGINE=MYISAM;
mysql> SELECT *, inet_ntoa(pow(2,24) * server + idN) AS id
FROM mau_partition;
+--------+-----+---------+
| server | idN | id |
+--------+-----+---------+
| a | 1 | 1.0.0.1 |
| a | 2 | 1.0.0.2 |
| a | 3 | 1.0.0.3 |
| b | 1 | 2.0.0.1 |
| b | 2 | 2.0.0.2 |
| c | 1 | 3.0.0.1 |
| c | 2 | 3.0.0.2 |
+--------+-----+---------+
• Partition data
• One central table
generating id
• Keep IP notation
• DayDream?
17. An integer table
• Always useful table
• Generate random values
• Check for missing values
• Use it as internal loops
18. An integer table
mysql> SHOW CREATE TABLE integers;
+-------------------------------------------------+
| CREATE TABLE |
+-------------------------------------------------+
| CREATE TABLE `integers` ( |
| `i` tinyint(3) unsigned DEFAULT NULL |
|) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------------------------------------------------+
1 row in set (0.00 sec)
mysql> INSERT INTO integers
VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Query OK, 10 rows affected (0.00 sec)
19. An integer table
mysql> SELECT 10 * d.i + u.i
FROM integers u
CROSS JOIN integers d;
+----------------+
| 10 * d.i + u.i |
+----------------+
| 0 |
| 1 |
// ..................
| 98 |
| 99 |
+----------------+
100 rows in set (0.00 sec)
20. Missing values
mysql> SELECT i AS missing
FROM integers
LEFT JOIN mce_with_holes
ON integers.i = mce_with_holes.id
WHERE mce_with_holes.id IS NULL;
+------+
| id |
+------+
| 1 |
| 2 |
| 4 |
| 6 |
| 7 |
| 9 |
+------+
+---------+
| missing |
+---------+
| 0 |
| 3 |
| 5 |
| 8 |
+---------+
21. Missing values
mysql> SELECT
DATE_FORMAT(NOW()- interval i MONTH,'%m-%Y') p2,
IFNULL( LEFT(period, 7), 'Missing') period
FROM integers LEFT JOIN EvolutionByCountry
ON period =
DATE_FORMAT( now() - interval i MONTH, '%Y-%m-01')
AND tag = 'mv'
ORDER BY i DESC;
| 08-2006 | Missing |
| 09-2006 | 2006-09 |
| 10-2006 | Missing |
| 11-2006 | Missing |
| 12-2006 | Missing |
| 01-2007 | 2007-01 |
| 02-2007 | 2007-02 |
| 03-2007 | 2007-03 |
| 04-2007 | Missing |
22. Internal loops
mysql> SELECT rand() FROM integers WHERE i < 5;
mysql> SELECT d.i * 10 + u.i AS counter,
SUBSTR('abcdefghijklmnopqrstuvwxyz',
-1 * (d.i * 10 + u.i), 1) AS letter
FROM integers u, integers d
WHERE d.i * 10 + u.i BETWEEN 1 AND 26;
+---------+--------+
| counter | letter |
+---------+--------+
// ..................
|      19 | h      |
|      20 | g      |
|      21 | f      |
|      22 | e      |
|      23 | d      |
|      24 | c      |
|      25 | b      |
|      26 | a      |
+---------+--------+
23. Internal loops
mysql> SELECT
i,
CHAR(15108241 + i) AS ideogramm
FROM integers u WHERE i < 6;
+---------+-------------------+
| i       | ideogramm         |
+---------+-------------------+
|       0 | 我                |
|       1 | 戒                |
|       2 | 戓                |
|       3 | 戔                |
|       4 | 戕                |
|       5 | 或                |
+---------+-------------------+
24. Random values
mysql> SELECT group_concat(char(rand() * 25 + 97)
SEPARATOR '' ) AS word
FROM integers AS l
JOIN integers AS w
WHERE l.i < rand() * 9 + 1
GROUP BY w.i;
+--------+
| word |
+--------+
| wwafq |
| zblhr |
| dxir |
| frh |
| yjzv |
| rrwg |
25. GROUP_CONCAT
• Concat() and concat_ws() :
now for groups
• Concatenate strings within GROUP BY
• ORDER BY
• SEPARATOR
• Limited to 1kb by default
• Change group_concat_max_len
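A minimal sketch of the options listed above, assuming a hypothetical table named `digits_demo`. When the result exceeds group_concat_max_len, MySQL truncates it and raises a warning rather than an error.

```sql
-- Hypothetical demo table holding the digits 0 to 9.
CREATE TABLE digits_demo (i TINYINT UNSIGNED);
INSERT INTO digits_demo
VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- Raise the limit for this connection only (the default is 1024 bytes).
SET SESSION group_concat_max_len = 8192;

-- ORDER BY and SEPARATOR both live inside the function call.
SELECT GROUP_CONCAT(i ORDER BY i DESC SEPARATOR '-') AS countdown
FROM digits_demo;    -- 9-8-7-6-5-4-3-2-1-0
```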
26. Grouping strings
mysql> SELECT region.name,
group_concat(region2.name
ORDER BY region2.name
SEPARATOR ', ') subregions
FROM region JOIN region AS region2
ON region.id = region2.`in`
GROUP BY region.name ORDER BY region.name;
+------------+------------------------------------+
| name | subregions |
+------------+------------------------------------+
| California | Sacramento, San Diego, Santa Clara |
| Canada | British Colombia, Québec |
| USA | California |
+------------+------------------------------------+
27. Second last of mohican
mysql> SELECT period,
MAX(percentage) as first,
MID(group_concat(format(percentage, 5) order by
percentage desc separator ',' ), 10, 8) AS second,
MID(group_concat(format(percentage, 5) order by
percentage desc separator ',' ), 19,
locate(',', group_concat(format(percentage, 5)
order by percentage desc separator ',' ) ,20) - 19)
AS third
FROM VersionEvolution GROUP BY period;
+------------+--------------+----------+----------+
| period | first | second | third |
+------------+--------------+----------+----------+
| 2005-10-01 | 26.443662847 | 19.78355 | 9.21313 |
| 2005-11-01 | 24.351049557 | 18.89599 | 8.72828 |
28. Transposition
+-----+------+-------+
| uid | key  | val   |
+-----+------+-------+
|   1 | name | Smith |
|   1 | age  | 22    |
|   1 | iq   | 100   |
|   2 | name | John  |
|   2 | age  | 33    |
|   3 | name | Doe   |
+-----+------+-------+
+------+-------+------+---------+
| uid  | name  | age  | others  |
+------+-------+------+---------+
|    1 | Smith | 22   | iq:100; |
|    2 | John  | 33   |         |
|    3 | Doe   |      |         |
+------+-------+------+---------+
29. Transposition
mysql> SELECT uid,
group_concat(if(`key` = 'name',val, '')
SEPARATOR '' ) as name,
group_concat(if(`key` = 'age',val, '')
SEPARATOR '' ) as age,
group_concat(if(`key` != 'age' AND
`key` != 'name',
concat(`key`,':',val,';'), '')
SEPARATOR '' ) as others
FROM table
GROUP BY uid;
+------+-------+------+---------+
| uid | name | age | others |
+------+-------+------+---------+
| 1 | Smith | 22 | iq:100; |
| 2 | John | 33 | |
| 3 | Doe | | |
+------+-------+------+---------+
30. Separating columns
mysql> SELECT SUBSTR(col, 2 * i + 1 , 1) as v
FROM mce_col
JOIN integers
ON 2 * i + 1 <= length(col);
+---------+
| col |
+---------+
| a,b,c |
| e,a |
| c,d,e,f |
+---------+
+------+
| v |
+------+
| a |
| e |
| c |
| b |
| a |
| d |
| c |
| e |
| f |
+------+
31. Separating columns
mysql> SHOW CREATE TABLE mce_col_sep;
+-------------------------------------------------+
| CREATE TABLE |
+-------------------------------------------------+
| CREATE TABLE `mce_col_sep`( |
| `col` SET('a','b','c','d','e','f') |
|) ENGINE=MyISAM CHARSET=latin1 |
+-------------------------------------------------+
1 row in set (0.00 sec)
mysql> INSERT INTO mce_col_sep
SELECT col FROM mce_col;
Query OK, 3 rows affected (0.00 sec)
32. Creating table
mysql> SELECT CONCAT(
"CREATE TABLE `mce_col_sep` (\n `col` SET('",
GROUP_CONCAT(v SEPARATOR "','"),
"')\n) ENGINE=MyISAM CHARSET=latin1")
AS `Create statement`
FROM (
SELECT
DISTINCT SUBSTR(col, 2 * i + 1 , 1) v
FROM mce_col
JOIN integers
ON 2 * i + 1 <= length(col))
subquery;
• Beware of commas and figures!!
33. Creating table
prompt> mysql -u R -D mce -B --skip-column-names -e "
SELECT CONCAT(
\"CREATE TABLE \`mce_col_sep\` ( \`col\` SET('\",
GROUP_CONCAT(v SEPARATOR \"','\"),
\"')) ENGINE=MyISAM CHARSET=latin1\")
AS \`Create_statement\` FROM (
SELECT DISTINCT SUBSTR(col, 2 * i + 1 , 1) v
FROM mce_col JOIN integers
ON 2 * i + 1 <= length(col))
subquery" | mysql -u root -D otherdb
• Get the query from mysql
• Feed it directly to MySQL
• Enjoy the fight with quotes: inside the shell's double quotes,
inner double quotes and backticks must be escaped
37. Us and the others
• Display statistics as a pie chart
• Small shares are
• Insignificant
• Hard to display
• Should be gathered as 'Others'
38. Us and the others
mysql> SELECT version, percentage
FROM statsPHPmajor;
+---------+--------------+
| version | percentage |
+---------+--------------+
| 2 | 0.000291572 |
| 3 | 0.445930425 |
| 4 | 83.676435775 |
| 5 | 15.871029479 |
| 6 | 2.9157e-05 |
+---------+--------------+
39. Us and the others
• We need a criterion: < 1%
• IF() in a SQL query
• Change the version name on the fly
• Dynamically reduce the number of lines with GROUP BY
• Some lines stay alone, others get grouped
• SUM() gathers all the percentages into one
40. Us and the others
mysql> SELECT
IF(percentage >1, version, 'Others') AS version,
SUM(percentage) AS percentage
FROM statsPHPmajor
GROUP BY
IF (percentage > 1, version, 'Others');
+---------+--------------+
| version | percentage |
+---------+--------------+
| 4 | 83.676435775 |
| 5 | 15.871029479 |
| Others | 0.446251154 |
+---------+--------------+
41. Groups of one
• GROUP BY handles groups of one
COUNT(*) as number, AVG(), etc.
SUM(percentage) AS percentage,
// Conditional sum
SUM(if (col > 1, fractions, 0)) AS share,
// Product
EXP(SUM(LN(interest_rate))) AS composed,
// Homogeneous group : STDDEV is 0
STDDEV(CRC32(text)) as homogeneous,
// Concatenation
GROUP_CONCAT(cols)
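A minimal sketch of the conditional sum and the product trick, assuming a hypothetical table named `rates_demo`. The product relies on exp(ln a + ln b) = a * b, so it only works for strictly positive values: LN() returns NULL for zero or negative input. The result is a floating-point number, hence "close to" rather than an exact value.

```sql
-- Hypothetical demo table: three factors for one account.
CREATE TABLE rates_demo (account CHAR(1), factor DOUBLE);
INSERT INTO rates_demo VALUES ('a', 2), ('a', 3), ('a', 4);

SELECT account,
  COUNT(*) AS number,                         -- 3
  SUM(IF(factor > 2, factor, 0)) AS share,    -- 3 + 4 = 7
  EXP(SUM(LN(factor))) AS product             -- close to 2 * 3 * 4 = 24
FROM rates_demo
GROUP BY account;
```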
42. WITH ROLLUP
mysql> SELECT version, SUM(percentage)
FROM statsPHPmajor
GROUP BY version WITH ROLLUP;
+---------+-----------------+
| version | SUM(percentage) |
+---------+-----------------+
| 2 | 0.000291572 |
| 3 | 0.445930425 |
| 4 | 83.676435775 |
| 5 | 15.871029479 |
| 6 | 2.9157e-05 |
| NULL | 99.993716408 |
+---------+-----------------+
43. WITH ROLLUP
• GROUP BY modifier
• It will present intermediate values
• Even more interesting with several
columns
mysql> SELECT major, middle, minor,
FORMAT(SUM(percentage),2) as percentage
FROM statsPHPversions
GROUP BY major, middle, minor
WITH ROLLUP;
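A minimal sketch of labelling the super-aggregate row, assuming a hypothetical table named `rollup_demo` with made-up percentages. IFNULL() cannot tell a rollup NULL from a genuine NULL in the data; MySQL 8.0.1 and later add GROUPING() for that case.

```sql
-- Hypothetical demo table shaped like statsPHPmajor; values are invented.
CREATE TABLE rollup_demo (version INT, percentage DOUBLE);
INSERT INTO rollup_demo VALUES (3, 0.4), (4, 83.6), (5, 15.8);

-- The rollup row comes back with version = NULL; give it a readable label.
SELECT IFNULL(version, 'Total') AS label,
       SUM(percentage) AS percentage
FROM rollup_demo
GROUP BY version WITH ROLLUP;
```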
47. MySQL Variables
• Available since prehistoric times
• Handled on a per-connection basis
• Destroyed upon disconnection
• No chance to step on others' values
• Globals
• Simultaneous assignment and usage
• Execution from left to right
48. MySQL variables
mysql> SELECT @total := SUM(number)
FROM statsPHPraw ;
mysql> INSERT INTO statsPHPversions
SELECT version, number / @total * 100
FROM statsPHPraw;
mysql> SELECT SUM(number)
FROM statsPHPraw;
// get 10107060 in a variable
mysql> INSERT INTO statsPHPversions
SELECT version, number / 10107060 * 100
FROM statsPHPraw;
49. MySQL variables
• Static SQL
• from the programming side, no more
need to build a SQL query on the fly
• Use them for better security and
readability
• Migrate toward stored procedures
• Another internal loop
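One way to read "static SQL" is a prepared statement fed by a user variable: the query text never changes, only the variable does. A minimal sketch, assuming a hypothetical table named `versions_demo` and a hypothetical statement name `by_version`:

```sql
-- Hypothetical demo table; values are invented.
CREATE TABLE versions_demo (version INT, percentage DOUBLE);
INSERT INTO versions_demo VALUES (4, 83.6), (5, 15.8);

-- The query text is fixed; only the variable changes between runs.
PREPARE by_version FROM
  'SELECT version, percentage FROM versions_demo WHERE version = ?';

SET @wanted := 5;
EXECUTE by_version USING @wanted;

SET @wanted := 4;
EXECUTE by_version USING @wanted;

DEALLOCATE PREPARE by_version;
```

Because the value is bound rather than concatenated into the query, it cannot change the structure of the statement.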
51. Agile loading
✦ Change order
✦ Reformat data
✦ Ignore some of them
✦ Split values
✦ Add other values
✦ Add constants
03-Mar-07 71,12 Vanuatu Australia
03-Mar-07 33,34 USA North America
04-Mar-07 17,85 Israel Eurasia
+---------+
| Field |
+---------+
| id |
| period |
| country |
| php |
| rank |
+---------+
52. Agile loading
mysql> SET @i := 0;
mysql> LOAD DATA INFILE '/tmp/stats.txt'
INTO TABLE statPHPload
(@date, @php, @country, @continent)
SET
id = 0,
period = date(STR_TO_DATE(@date, '%d-%b-%y')),
rank = (@i := @i + 1),
php = CAST( REPLACE(@php, ',','.') as DECIMAL),
country = @country;
54. Ranking
mysql> SELECT country, php
FROM statsPHPcountry2
ORDER BY php DESC;
+----------------+------+
| country | php |
+----------------+------+
| Vanuatu | 71 |
| F. Polynesia | 68 |
| United Kingdom | 33 |
| USA | 33 |
| Greenland | 19 |
| Israel | 18 |
+----------------+------+
55. Ranking : one-pass
mysql> SET @rank := 0;
mysql> SELECT @rank := @rank + 1 AS rank,
country, php FROM statsPHPcountry2
ORDER BY php DESC;
+------+----------------+------+
| rank | country | php |
+------+----------------+------+
| 1 | Vanuatu | 71 |
| 2 | F. Polynesia | 68 |
| 3 | United Kingdom | 33 |
| 4 | USA | 33 |
| 5 | Greenland | 19 |
| 6 | Israel | 18 |
+------+----------------+------+
56. Ranking : ex aequo (ties)
mysql> SET @rank := 0, @prev := NULL;
mysql> SELECT
@rank := if(@prev=php, @rank, @rank+ 1) AS rank,
country, @prev:= php AS php
FROM statsPHPcountry2
ORDER BY php DESC;
+------+----------------+-----+
| rank | country | php |
+------+----------------+-----+
| 1 | Vanuatu | 71 |
| 2 | F. Polynesia | 68 |
| 3 | United Kingdom | 33 |
| 3 | USA | 33 |
| 4 | Greenland | 19 |
| 5 | Israel | 18 |
+------+----------------+-----+
57. Final ranking
mysql> SET @num := 0, @rank := 0, @prev := NULL;
mysql> SELECT GREATEST(@num := @num + 1,
@rank := if(@prev != php, @num, @rank)) AS rank,
country, @prev := php AS php
FROM statsPHPcountry2
ORDER BY php DESC;
+------+----------------+------+
| rank | country | php |
+------+----------------+------+
| 1 | Vanuatu | 71 |
| 2 | F. Polynesia | 68 |
| 3 | United Kingdom | 33 |
| 4 | USA | 33 |
| 5 | Greenland | 19 |
| 6 | Israel | 18 |
+------+----------------+------+
58. Programming SQL
• Use LEAST/GREATEST to hide extra assignments within the SQL
• These functions accept an arbitrary number of arguments
• Just choose carefully the one you need
• Don't turn your SQL into a full-blown program
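The ranking slides rely on user variables being assigned left to right inside a SELECT. The MySQL manual does not guarantee that order, and the syntax is deprecated in MySQL 8.0. As a hedged alternative (not the technique from the talk), MySQL 8.0 and later offer window functions. A minimal sketch, assuming a hypothetical table named `ranking_demo`:

```sql
-- Hypothetical demo table with the values shown in the ranking slides.
CREATE TABLE ranking_demo (country VARCHAR(20), php INT);
INSERT INTO ranking_demo VALUES
  ('Vanuatu', 71), ('F. Polynesia', 68), ('United Kingdom', 33),
  ('USA', 33), ('Greenland', 19), ('Israel', 18);

-- Requires MySQL 8.0 or later.
SELECT country, php,
  ROW_NUMBER() OVER w AS one_pass,     -- 1, 2, 3, 4, 5, 6
  DENSE_RANK() OVER w AS with_ties,    -- 1, 2, 3, 3, 4, 5
  RANK()       OVER w AS with_gaps     -- 1, 2, 3, 3, 5, 6
FROM ranking_demo
WINDOW w AS (ORDER BY php DESC)
ORDER BY php DESC;
```

Which of the two tied countries gets one_pass = 3 is not defined, since both have the same php value.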
59. UPDATE on SELECT
• Make an update, and select values at the
same time
• Like UPDATE on SELECT from InnoDB
• No need for transaction
• Available with MyISAM
60. Atomic queries
mysql> CREATE TABLE seq (id int unsigned);
mysql> INSERT INTO seq values (0);
mysql> UPDATE seq SET id = (@id := (id + 1) % 5);
mysql> UPDATE seq SET id = ((@id := id) + 1) % 5;
mysql> SELECT @id;
✦ Emulate sequences
✦ Not just auto_increment
✦ Cyclic ids, negative increment,
✦ strings, enum/set type
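The MySQL manual documents a related idiom for sequences: passing an expression to LAST_INSERT_ID() stores the value per connection, so two clients never read each other's number. A minimal sketch, assuming a hypothetical table named `seq_demo`:

```sql
-- Hypothetical one-row sequence table.
CREATE TABLE seq_demo (id INT UNSIGNED NOT NULL);
INSERT INTO seq_demo VALUES (0);

-- Bump the counter and remember the new value for this connection.
UPDATE seq_demo SET id = LAST_INSERT_ID(id + 1);
SELECT LAST_INSERT_ID();    -- 1

-- Cyclic variant: the counter wraps back to 0 after 4.
UPDATE seq_demo SET id = LAST_INSERT_ID((id + 1) % 5);
SELECT LAST_INSERT_ID();    -- 2
```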
61. UPDATE on SELECT
mysql> SET @x := '';
mysql> UPDATE seq2 SET id =
GREATEST(id + 2, @x := CONCAT(letter, ',', @x))
WHERE id % 2;
mysql> SELECT @x;
+------+--------+
| id   | letter |
+------+--------+
|    0 | a      |
|    3 | b      |
|    2 | c      |
|    5 | d      |
|    4 | e      |
|    7 | f      |
+------+--------+
+------------+
| @x         |
+------------+
| a,c,e,g,i, |
+------------+
63. Obtaining top n rows
✦ Classic problem
✦ Use a temporary table and a join
✦ Use a subquery and a MySQL variable
mysql> SELECT *, MAX(col) FROM TABLE;
64. Obtaining top n rows
mysql> SET @num := 0, @rank := 0, @prev := NULL;
mysql> SELECT * from (
SELECT @rank:= if(@prev=tag,@rank+1,0) rank,
@prev := tag as country, period as month,
quantity FROM EvolutionByCountry
ORDER BY country, quantity
) AS t WHERE rank < 3;
+------+---------+------------+----------+
| rank | country | month | quantity |
+------+---------+------------+----------+
| 0 | us | 2007-01-01 | 33 |
| 1 | us | 2006-12-01 | 33 |
| 2 | us | 2007-02-01 | 33 |
+------+---------+------------+----------+
65. Word slicing
Documentation MySQL : this is the
documentation.
• Slicing text column into words
• Not just static length variables
• Words have different lengths
66. Word slicing
mysql> SET @a := 1, @b := 1;
mysql> SELECT * FROM (
SELECT i, @a,
@b := LEAST(
locate(' ', concat(manual, ' '), @a + 1),
locate(',', concat(manual, ','), @a + 1),
locate(':', concat(manual, ':'), @a + 1),
locate('.', concat(manual, '.'), @a + 1)
) as pos,
@b - @a AS length,
substr(manual, @a, @b - @a) as word,
@a := @b + 1
FROM integers, mysql_doc
WHERE @b < length(manual)) subquery
WHERE length > 1 OR
LOCATE(' ,:;.', word) > 0;
68. Adding chaos
• Extract random rows from a table
• SQL helps sorting, not shuffling!
• Lottery, random tests, card dealing, genetic programming
• Can be done from the programming language
69. Adding chaos
mysql> SELECT col FROM tbl
WHERE SECOND(date) = floor(RAND() * 60)
LIMIT 10;
mysql> SELECT names FROM drivers
ORDER BY CRC32(CONCAT(names, NOW()));
mysql> SELECT id FROM tbl
WHERE id % 31 = 3
LIMIT 10;
• Know your data and use it as random
sources
70. Adding chaos
mysql> SELECT i FROM integers ORDER BY RAND();
+------+
| i |
+------+
| 5 |
| 8 |
| 7 |
| 4 |
| 1 |
| 9 |
| 6 |
| 3 |
| 2 |
| 0 |
+------+
10 rows in set (0.00 sec)
72. Using indexed chaos
• Store RAND() in extra column and index it
• Still use ORDER BY
• Use LIMIT offset from main program
• Update table once in a while
73. Adding indexed chaos
mysql> SELECT col FROM tbl ORDER BY chaos LIMIT 10;
10 rows in set (0.00 sec)
mysql> ALTER TABLE tbl ADD INDEX(chaos);
Query OK, 10000000 rows affected (28.69 sec)
mysql> UPDATE tbl SET chaos=RAND();
Query OK, 10000000 rows affected (3 min 40.53 sec)
74. Getting one random
mysql> SELECT id, cols FROM table JOIN
(SELECT CEIL(RAND() *
(SELECT MAX(id) FROM table)) AS r)
AS r2 ON id = r;
• Deepest sub-query is type const
• Sub-query does not use table
• id is a positive integer column
• auto_increment and continuous
75. Random and holes
mysql> CREATE TABLE holes (
table_id INT NOT NULL PRIMARY KEY,
sequence INT UNIQUE AUTO_INCREMENT);
mysql> INSERT IGNORE INTO holes
SELECT id, 0 FROM table;
mysql> SELECT id, cols FROM table
JOIN holes ON table.id = holes.table_id
JOIN (SELECT CEIL(RAND() *
(SELECT MAX(sequence) FROM holes)) AS r)
AS r2 ON holes.sequence = r2.r;
76. Several random?
• Plug in the integers table
• Add DISTINCT to avoid duplicates
• Beware of selecting too large a subset of integers
mysql> SELECT id, cols FROM table JOIN
(SELECT DISTINCT CEIL(RAND() *
(SELECT MAX(id) FROM table)) AS r
FROM integers WHERE i < 5)
AS rt2 ON id = r;
77. Timed lock
• LOCK TABLE
• Waits until it starts working
• GET_LOCK('name', 3)
• System wide lock
• Waits 3 seconds, then gives up
• Collaborative work
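A minimal sketch of a timed lock. The lock name 'nightly_import' is hypothetical. The lock is advisory: it only protects code that asks for the same name, which is what makes it collaborative.

```sql
-- Try to take the named lock, waiting at most 3 seconds.
-- Returns 1 on success, 0 on timeout, NULL on error.
SELECT GET_LOCK('nightly_import', 3);

-- ... do the work that must not run twice at the same time ...

-- Any connection can check whether the name is free (1) or taken (0).
SELECT IS_FREE_LOCK('nightly_import');

-- Release it; it is also released when the connection ends.
SELECT RELEASE_LOCK('nightly_import');
```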
78. References
• MySQL Documentation, MySQL Press
• MySQL Cookbook by Paul DuBois, O'Reilly
• SQL Hacks by Andrew Cumming and Gordon Russell, O'Reilly
79. Great MySQL blogs
• Baron Schwartz
http://www.xaprb.com/blog/
• Giuseppe Maxia
http://datacharmer.blogspot.com/
• Sheeri Kritzer
http://sheeri.com/
• Roland Bouman
http://rpbouman.blogspot.com/
• Ronald Bradford
http://blog.arabx.com.au/
• Jan Kneschke
http://jan.kneschke.de/
(Here at the conf!)
80. Great MySQL blogs
• Morgan Tocker
http://www.tocker.id.au/
• MySQL Planet
http://www.planetmysql.org/
• Moosh et son Brol (Fr)
http://moosh.et.son.brol.be/
(Not here)