Character encoding
Breaking and unbreaking your data
Maciej Dobrzanski
maciek@psce.com | @mushupl
Brussels, 1 Feb 2015
01.02.2015 Follow us on Twitter @dbasquare www.psce.com
Character Encoding
• Binary representation of glyphs
• Each character can be represented by 1 or more bytes
• Popular schemes
• ASCII
• Unicode
• UTF-8, UTF-16, UTF-32
• Language specific character sets
• US (Latin US)
• Europe (Latin 1, Latin 2)
• Asia (EUC-KR, GB18030)
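A minimal sketch of the "1 or more bytes" point, using Python's built-in codecs rather than MySQL. The sample characters and the helper name `encoded_lengths` are chosen here for illustration and are not from the slides.

```python
# How many bytes the same character needs under different Unicode encodings.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")

def encoded_lengths(ch):
    """Return {encoding: byte length} for a single character."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}

for ch in ("A", "ó", "東"):
    print(ch, encoded_lengths(ch))

# Single-byte character sets cover only part of Unicode:
print(len("ó".encode("latin-1")))  # fits in one byte
try:
    "東".encode("latin-1")
except UnicodeEncodeError:
    print("'東' has no latin1 representation")
```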
Character Encoding
• Character set defines the visual interpretation of binary information
• One glyph can be associated with several numeric codes
• One numeric code may be used to represent several different glyphs
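Both directions can be seen with a single byte and a single letter. This is a sketch using Python codecs (`iso-8859-1` and `iso-8859-2` standing in for Latin 1 and Latin 2); the choice of 0xF1 and 'ń' is an illustration, not taken from the slides.

```python
# One numeric code, several glyphs: byte 0xF1 depends on the character set.
raw = bytes([0xF1])
as_latin1 = raw.decode("iso-8859-1")  # Western European reading
as_latin2 = raw.decode("iso-8859-2")  # Central European reading
print(as_latin1, as_latin2)

# One glyph, several numeric codes: 'ń' under different encodings.
codes = {enc: "ń".encode(enc).hex() for enc in ("iso-8859-2", "utf-8", "utf-16-be")}
print(codes)
```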
Please state the nature of the emergency
• Application configuration
• Database configuration
• Table/column definitions
Problem #1: We are all born Swedish
• MySQL uses latin1 by default
• MySQL 5.7 too
• Is anyone actually aware of that?
• Why Swedish?
• latin1_swedish_ci is the default collation
Problem #1
• Let’s build an application
mysql> SELECT @@global.character_set_server, @@session.character_set_client;
+-------------------------------+--------------------------------+
| @@global.character_set_server | @@session.character_set_client |
+-------------------------------+--------------------------------+
| latin1 | latin1 |
+-------------------------------+--------------------------------+
1 row in set (0.00 sec)
mysql> CREATE SCHEMA fosdem;
Query OK, 1 row affected (0.00 sec)
mysql> USE fosdem;
mysql> CREATE TABLE locations (city VARCHAR(30) NOT NULL);
Query OK, 0 rows affected (0.15 sec)
mysql> SHOW CREATE TABLE locations\G
*************************** 1. row ***************************
Table: locations
Create Table: CREATE TABLE `locations` (
`city` varchar(30) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Problem #1
• Everything is correct… NOT!
mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from locations;
+--------------------+
| city |
+--------------------+
| Berlin |
| KrakÃ³w |
| æ±äº¬éƒ½ |
+--------------------+
3 rows in set (0.00 sec)
mysql> SET NAMES latin1;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from locations;
+-----------+
| city |
+-----------+
| Berlin |
| Kraków |
| 東京都 |
+-----------+
3 rows in set (0.00 sec)
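A rough Python model of what happens here, assuming the application sent UTF-8 bytes over a session declared as latin1. The function names are hypothetical, and Python's `latin-1` codec only approximates MySQL's latin1 (which is based on cp1252).

```python
def store_via_latin1_session(text):
    """The client sends UTF-8 bytes; the server reads each byte as one latin1 character."""
    wire_bytes = text.encode("utf-8")
    return wire_bytes.decode("latin-1")  # the characters the server believes it stored

def read_with(stored_chars, session_charset):
    """The server converts stored characters to the session charset; the terminal shows UTF-8."""
    wire_bytes = stored_chars.encode(session_charset)
    return wire_bytes.decode("utf-8")

stored = store_via_latin1_session("Kraków")
print(read_with(stored, "utf-8"))    # SET NAMES utf8: server transcodes, text looks broken
print(read_with(stored, "latin-1"))  # SET NAMES latin1: bytes pass through, text looks fine
```

The data only looks correct under latin1 because the same wrong assumption is applied on the way in and on the way out.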
Problem #1
• Let’s fix this
• Or can we ignore it?
• Ruby may not like it
# grep character-set-server /etc/mysql/my.cnf
character-set-server = utf8
mysql> SELECT @@global.character_set_server, @@session.character_set_client;
+-------------------------------+--------------------------------+
| @@global.character_set_server | @@session.character_set_client |
+-------------------------------+--------------------------------+
| utf8 | utf8 |
+-------------------------------+--------------------------------+
1 row in set (0.00 sec)
...we are fixing our tables here...
mysql> SHOW CREATE TABLE locations\G
*************************** 1. row ***************************
Table: locations
Create Table: CREATE TABLE `locations` (
`city` varchar(30) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Problem #1: The good news
• It’s usually fixable
Problem #2: Settings, defaults, inheritance
• Where do you set character sets in MySQL?
• Session settings
• character_set_server
• character_set_client
• character_set_connection
• character_set_database
• character_set_results
• Schema level defaults
• Table level defaults
• Column charsets
Problem #2
• Having fixed our problem #1, we continue to develop our application
mysql> SELECT @@session.character_set_server, @@session.character_set_client;
+--------------------------------+--------------------------------+
| @@session.character_set_server | @@session.character_set_client |
+--------------------------------+--------------------------------+
| utf8 | utf8 |
+--------------------------------+--------------------------------+
1 row in set (0.00 sec)
mysql> USE fosdem;
mysql> CREATE TABLE people (first_name VARCHAR(30) NOT NULL, last_name VARCHAR(30) NOT NULL);
Query OK, 0 rows affected (0.13 sec)
Problem #2
• Why is the table character set latin1?
mysql> SELECT @@session.character_set_server, @@session.character_set_client;
+--------------------------------+--------------------------------+
| @@session.character_set_server | @@session.character_set_client |
+--------------------------------+--------------------------------+
| utf8 | utf8 |
+--------------------------------+--------------------------------+
1 row in set (0.00 sec)
mysql> USE fosdem;
mysql> SHOW CREATE TABLE people\G
*************************** 1. row ***************************
Table: people
Create Table: CREATE TABLE `people` (
`first_name` varchar(30) NOT NULL,
`last_name` varchar(30) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Problem #2
• What’s all this, then?
mysql> SHOW SESSION VARIABLES LIKE 'character_set_%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
mysql> SHOW CREATE DATABASE fosdem\G
*************************** 1. row ***************************
Database: fosdem
Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
1 row in set (0.00 sec)
Problem #2
• Can we fix this?
mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT last_name, HEX(last_name) FROM people;
+------------+----------------------+
| last_name | HEX(last_name) |
+------------+----------------------+
| Lemon | 4C656D6F6E |
| Müller | 4DFC6C6C6572 |
| Dobrza?ski | 446F62727A613F736B69 |
+------------+----------------------+
3 rows in set (0.00 sec)
mysql> SET NAMES latin2;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT last_name, HEX(last_name) FROM people;
+------------+----------------------+
| last_name | HEX(last_name) |
+------------+----------------------+
| Lemon | 4C656D6F6E |
| Müller | 4DFC6C6C6572 |
| Dobrza?ski | 446F62727A613F736B69 |
+------------+----------------------+
3 rows in set (0.00 sec)
• We can’t! :-(
• 0x3F is '?', so my 'ń' was lost
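The hex values above can be reproduced outside MySQL. A sketch, assuming the server converted utf8 input to a latin1 column and substituted '?' for anything latin1 cannot represent; `to_latin1_column` is a hypothetical helper, not a MySQL function.

```python
def to_latin1_column(text):
    """Convert text to latin1 bytes, replacing unrepresentable characters with '?' (0x3F)."""
    return text.encode("latin-1", errors="replace")

for name in ("Lemon", "Müller", "Dobrzański"):
    stored = to_latin1_column(name)
    print(stored.decode("latin-1"), stored.hex().upper())
```

'ü' exists in latin1 (0xFC) and survives; 'ń' does not, and once it has been written as 0x3F nothing in the stored bytes says which character it used to be.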
Problem #2: The bad news
• It may not be enough to configure the server correctly
• A mismatch between client and server can permanently break data
• Implicit conversion inside MySQL server
Problem #2: Settings, defaults, inheritance
• Where do you set character sets in MySQL?
• Session settings
• character_set_server
• character_set_client
• character_set_connection
• character_set_database
• character_set_results
• Schema level defaults – affect new tables
• Table level defaults – affect new columns
• Column charsets
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} ((none)) > SELECT @@global.character_set_server, @@session.character_set_client;
+-------------------------------+--------------------------------+
| @@global.character_set_server | @@session.character_set_client |
+-------------------------------+--------------------------------+
| latin1 | utf8 |
+-------------------------------+--------------------------------+
1 row in set (0.00 sec)
master [localhost] {msandbox} ((none)) > CREATE SCHEMA fosdem\G
Query OK, 1 row affected (0.00 sec)
master [localhost] {msandbox} ((none)) > SHOW CREATE SCHEMA fosdem\G
*************************** 1. row ***************************
Database: fosdem
Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
1 row in set (0.00 sec)
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} ((none)) > USE fosdem;
Database changed
master [localhost] {msandbox} (fosdem) > CREATE TABLE test (a VARCHAR(300), INDEX (a));
Query OK, 0 rows affected (0.62 sec)
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) DEFAULT NULL,
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} (fosdem) > ALTER TABLE test DEFAULT CHARSET = utf8;
Query OK, 0 rows affected (0.08 sec)
Records: 0 Duplicates: 0 Warnings: 0
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Problem #2: Settings, defaults, inheritance
master [localhost] {msandbox} (fosdem) > ALTER TABLE test ADD b VARCHAR(10);
Query OK, 0 rows affected (0.74 sec)
Records: 0 Duplicates: 0 Warnings: 0
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
`b` varchar(10) DEFAULT NULL,
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
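The inheritance chain in this walkthrough can be summarised as a toy model. This is only a sketch of the observed behaviour: the classes `Schema` and `Table` and their methods are invented for illustration and do not correspond to any MySQL API.

```python
from dataclasses import dataclass, field

@dataclass
class Table:
    default_charset: str
    columns: dict = field(default_factory=dict)  # column name -> charset fixed at creation

    def add_column(self, name, charset=None):
        # A new column takes the table default unless a charset is given explicitly.
        self.columns[name] = charset or self.default_charset

    def alter_default_charset(self, charset):
        # Like ALTER TABLE ... DEFAULT CHARSET: existing columns are left untouched.
        self.default_charset = charset

@dataclass
class Schema:
    default_charset: str

    def create_table(self, charset=None):
        # A new table takes the schema default unless a charset is given explicitly.
        return Table(charset or self.default_charset)

def create_schema(server_charset, charset=None):
    # A new schema takes character_set_server unless a charset is given explicitly.
    return Schema(charset or server_charset)

fosdem = create_schema(server_charset="latin1")
test = fosdem.create_table()
test.add_column("a")
test.alter_default_charset("utf8")
test.add_column("b")
print(test.default_charset, test.columns)
```

Each level is consulted only once, at creation time, which is why changing a default later never touches what already exists.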
I f**ckd up. What do I do?
• Let’s start with what you shouldn’t do
• Keep calm and don’t start by changing something
• Analyze the situation
• Why did the problem occur in the first place?
• Reassess the damage
• Is it consistent?
• Are all rows broken in the same way?
• Are some rows bad, but others are okay?
• Are all rows bad, but in several different ways?
• Is it actually repairable?
• Yes, if no character mapping occurred during writes (e.g. Unicode data sent over a latin1 connection into latin1 columns)
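One way to check consistency is to classify a sample of values as read over a utf8 connection. The heuristic below is a sketch under assumptions, not a definitive detector: `classify` and its labels are hypothetical, a literal '?' may be legitimate data, and Python's `latin-1` codec only approximates MySQL's latin1.

```python
from collections import Counter

def classify(value):
    """Heuristically label one column value read over a utf8 connection."""
    if value.isascii():
        # Pure ASCII is unaffected by charset mix-ups, but '?' may mark a lost character.
        return "lossy?" if "?" in value else "ascii"
    try:
        # If the characters map back to bytes that form valid UTF-8,
        # the value was most likely UTF-8 stored as latin1 (repairable).
        value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return "lossy?" if "?" in value else "clean"
    return "double-encoded"

sample = ["Berlin", "KrakÃ³w", "Kraków", "東京都", "Dobrza?ski"]
print(Counter(classify(v) for v in sample))
```

A mix of "clean" and "double-encoded" rows in the same column means the rows are not all broken in the same way, so a blanket conversion would damage the good ones.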
I f**ckd up. What else shouldn’t I do, then?
• Do not rush things as you may easily go from bad to worse
• Do not start fixing this on a replication slave
• You can’t fix this by fixing tables one by one on a live database
• Unless you really have everything in one table
• Do not use: ALTER TABLE … DEFAULT CHARSET = …
• It only changes the default character set for new columns
• Do not use: ALTER TABLE … CONVERT TO CHARACTER SET …
• It’s not for fixing broken encoding
• Do not use: ALTER TABLE … MODIFY col_name … CHARACTER SET …
I f**ckd up. So how do I fix it?
• What needs to be fixed?
• Schema default character set
• ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
• Tables with text columns: CHAR, VARCHAR, TEXT, TINYTEXT, LONGTEXT
• What about ENUM?
• Use INFORMATION_SCHEMA to grab a list
• What about other tables?
• They too (eventually), but it’s not critical
SELECT CONCAT(c.table_schema, '.', c.table_name) AS candidate_table
FROM information_schema.columns c
WHERE c.table_schema = 'fosdem'
AND c.column_type REGEXP '^(.*CHAR|.*TEXT|ENUM)(\\(.+\\))?$'
GROUP BY candidate_table;
I f**ckd up. So how do I fix it?
• Option 1 – requires downtime
• Dump and restore
• Dump the data preserving the bad configuration and drop the old database
bash# mysqldump -u root -p --skip-set-charset --default-character-set=latin1 fosdem > fosdem.sql
mysql> DROP SCHEMA fosdem;
• Correct table definitions in the dump file
• Edit DEFAULT CHARSET in all CREATE TABLE statements
• Create the database again and import the data back
mysql> CREATE SCHEMA fosdem DEFAULT CHARSET utf8;
bash# mysql -u root -p --default-character-set=utf8 fosdem < fosdem.sql
I f**ckd up. So how do I fix it?
• Option 2 – requires downtime
• Perform a two step conversion with ALTER TABLE
• Original encoding -> VARBINARY/BLOB -> Target encoding
• Conversion from/to BINARY/BLOB removes character set context
• How?
• Stop applications
• On each table, for each text column perform:
ALTER TABLE tbl MODIFY col_name VARBINARY(255);
ALTER TABLE tbl MODIFY col_name VARCHAR(255) CHARACTER SET utf8;
• You may specify multiple columns per ALTER TABLE
• Fix the problems (application and/or db configs)
• Restart applications
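Why the detour through VARBINARY matters can be sketched with plain bytes. This assumes the column holds UTF-8 bytes but is declared latin1; the two functions are hypothetical stand-ins for a direct charset conversion and for the two-step conversion, not MySQL calls.

```python
stored_bytes = "Kraków".encode("utf-8")  # bytes on disk in a column declared as latin1

def convert_directly(raw):
    """Direct conversion: read bytes as latin1 characters, then re-encode them as utf8."""
    return raw.decode("latin-1").encode("utf-8")

def relabel_via_binary(raw):
    """Two-step conversion: the binary type has no charset, so the bytes are kept as-is."""
    as_binary = bytes(raw)  # latin1 -> VARBINARY: charset context removed
    return as_binary        # VARBINARY -> utf8: same bytes, new label

print(convert_directly(stored_bytes).decode("utf-8"))    # mojibake becomes permanent
print(relabel_via_binary(stored_bytes).decode("utf-8"))  # original text
```

The direct route transcodes characters that were never really latin1, while the binary route only changes how the existing bytes are interpreted.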
I f**ckd up. So how do I fix it?
• Option 3 – online character set fix; no downtime*
• Thanks to our plugin for pt-online-schema-change
• and a tiny patch for pt-online-schema-change that goes with the plugin
• How?
• Start pt-online-schema-change on all tables – one by one
• Do not rotate tables (--no-swap-tables) or drop pt-osc triggers
• Wait until all tables have been converted
• Stop applications
• Fix the problems (application and/or db configs)
• Rotate tables – takes just 1 minute
• Restart applications
• Et voilà
GOTCHAs!
• Data space requirements may change during conversion
• latin1 uses 1 byte per character; utf8 must allow for up to 3 bytes per character
• VARCHAR/TEXT fit up to 64KB – it won’t fit 65536 multi-byte characters
• Key length limit is 767 bytes
• Data type and/or index length changes may be required
• Test and plan this ahead
• There may be more problems than you think
• Detect irrecoverable problems with a simple stored function
CREATE FUNCTION `cnv_test_conversion` (`value_before` LONGTEXT, `value_after` LONGTEXT) RETURNS tinyint(1)
BEGIN
RETURN (IFNULL(CONVERT(CONVERT(`value_before` USING latin1) USING binary), "") =
IFNULL(CONVERT(`value_after` USING binary), ""));
END;;
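The same comparison, written as a Python sketch to make the logic easier to follow. It assumes `value_before` is the text as MySQL currently reads it and `value_after` is the value from the converted utf8 column; the function name mirrors the SQL one but this is not the authors' code.

```python
def cnv_test_conversion(value_before, value_after):
    """True when the latin1 bytes of the old value equal the raw bytes of the new value."""
    # CONVERT(... USING latin1) then binary: recover the bytes that were originally written.
    before_bytes = (value_before or "").encode("latin-1", errors="replace")
    # CONVERT(... USING binary) on the fixed column: its raw bytes, assumed UTF-8 here.
    after_bytes = (value_after or "").encode("utf-8")
    return before_bytes == after_bytes

print(cnv_test_conversion("KrakÃ³w", "Kraków"))          # bytes survived the fix
print(cnv_test_conversion("Dobrza?ski", "Dobrzański"))   # a character was already lost
```

Rows where the check fails are the ones the conversion could not carry over byte for byte.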
GOTCHAs!
master [localhost] {msandbox} (fosdem) > ALTER TABLE test MODIFY a VARCHAR(300) CHARACTER SET utf8;
Query OK, 0 rows affected, 1 warning (1.23 sec)
Records: 0 Duplicates: 0 Warnings: 1
master [localhost] {msandbox} (fosdem) > SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Warning
Code: 1071
Message: Specified key was too long; max key length is 767 bytes
1 row in set (0.00 sec)
master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`a` varchar(300) DEFAULT NULL,
`b` varchar(10) DEFAULT NULL,
KEY `a` (`a`(255))
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
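The 255-character index prefix above follows from simple arithmetic. A sketch, using the 767-byte limit from the warning; `max_indexable_chars` is a hypothetical helper, and the utf8mb4 entry (4 bytes per character) is added for comparison and is not covered in the slides.

```python
KEY_LIMIT_BYTES = 767  # limit reported in the warning above
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}

def max_indexable_chars(charset, limit=KEY_LIMIT_BYTES):
    """Longest index prefix, in characters, that fits within the key length limit."""
    return limit // MAX_BYTES_PER_CHAR[charset]

for charset in MAX_BYTES_PER_CHAR:
    print(charset, max_indexable_chars(charset))

# VARCHAR(300) in utf8 may need 300 * 3 = 900 bytes, so the index is cut to a prefix.
print(300 * MAX_BYTES_PER_CHAR["utf8"] > KEY_LIMIT_BYTES)
```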
How to do it right?
• Set character-set-server during initial configuration
• When creating new schemas, always specify the desired charset
• CREATE SCHEMA fosdem DEFAULT CHARSET = utf8
• ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
• When creating new tables, also explicitly specify the charset
• CREATE TABLE people (…) DEFAULT CHARSET = utf8
• And don’t forget to configure applications too
• You can try to force charset on the clients
• init-connect = "SET NAMES utf8"
• It might also break applications that don’t want to talk to MySQL using utf8
Oh, and one more thing…
• We are sharing WebScaleSQL packages with the MySQL Community!
• Check out http://www.psce.com/blog for details
• Follow @dbasquare to receive updates
WebScaleSQL
What is WebScaleSQL?
WebScaleSQL is a collaboration among engineers from several companies
such as Facebook, Twitter, Google or Linkedin, that face the same challenges
in deploying MySQL at scale, and seek greater performance from a database
technology tailored for their needs.

All you need to know about Spring Boot and GraalVM
Alina Yurenko
 
Orca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container OrchestrationOrca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container Orchestration
Pedro J. Molina
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
kalichargn70th171
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
kgyxske
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
Bert Jan Schrijver
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
dakas1
 
Streamlining End-to-End Testing Automation
Streamlining End-to-End Testing AutomationStreamlining End-to-End Testing Automation
Streamlining End-to-End Testing Automation
Anand Bagmar
 
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
kalichargn70th171
 
Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.
KrishnaveniMohan1
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
kalichargn70th171
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Paul Brebner
 
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
Luigi Fugaro
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
michniczscribd
 

Recently uploaded (20)

DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Boost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management AppsBoost Your Savings with These Money Management Apps
Boost Your Savings with These Money Management Apps
 
Computer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdfComputer Science & Engineering VI Sem- New Syllabus.pdf
Computer Science & Engineering VI Sem- New Syllabus.pdf
 
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptxOperational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
Operational ease MuleSoft and Salesforce Service Cloud Solution v1.0.pptx
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
 
Orca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container OrchestrationOrca: Nocode Graphical Editor for Container Orchestration
Orca: Nocode Graphical Editor for Container Orchestration
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
 
Streamlining End-to-End Testing Automation
Streamlining End-to-End Testing AutomationStreamlining End-to-End Testing Automation
Streamlining End-to-End Testing Automation
 
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
 
bgiolcb
bgiolcbbgiolcb
bgiolcb
 
Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.Penify - Let AI do the Documentation, you write the Code.
Penify - Let AI do the Documentation, you write the Code.
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
 
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
WMF 2024 - Unlocking the Future of Data Powering Next-Gen AI with Vector Data...
 
Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
 

Character Encoding - MySQL DevRoom - FOSDEM 2015

  • 1. Character encoding Breaking and unbreaking your data Maciej Dobrzanski maciek@psce.com | @mushupl Brussels, 1 Feb 2015 01.02.2015 Follow us on Twitter @dbasquare www.psce.com
  • 2. Character Encoding
      • Binary representation of glyphs
      • Each character can be represented by 1 or more bytes
      • Popular schemes
          • ASCII
          • Unicode
              • UTF-8, UTF-16, UTF-32
          • Language specific character sets
              • US (Latin US)
              • Europe (Latin 1, Latin 2)
              • Asia (EUC-KR, GB18030)
  • 3. Character Encoding
      • Character set defines the visual interpretation of binary information
      • One glyph can be associated with several numeric codes
      • One numeric code may be used to represent several different glyphs
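The last two bullets of slide 3 can be checked in a few lines. This is a minimal sketch in Python, which the talk itself does not use; the codec names (`iso-8859-1`, `iso-8859-2`, `utf-8`) are Python's names and are assumed here to stand in for Latin 1, Latin 2 and UTF-8.

```python
import unicodedata

# One numeric code, several glyphs: the same byte decodes to a different
# letter depending on which character set is assumed.
raw = bytes([0xF1])
as_latin1 = raw.decode("iso-8859-1")   # Latin 1 (Western Europe)
as_latin2 = raw.decode("iso-8859-2")   # Latin 2 (Central Europe)
print(unicodedata.name(as_latin1))
print(unicodedata.name(as_latin2))

# One glyph, several numeric codes: the same letter is stored as
# different bytes in different encodings.
letter = "\u0144"  # LATIN SMALL LETTER N WITH ACUTE
in_latin2 = letter.encode("iso-8859-2")
in_utf8 = letter.encode("utf-8")
print(in_latin2.hex(), in_utf8.hex())
```

The byte on its own carries no hint about which reading is intended; that information has to come from configuration.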
  • 4. Please state the nature of the emergency
      • Application configuration
      • Database configuration
      • Table/column definitions
  • 5. Problem #1: We are all born Swedish
      • MySQL uses latin1 by default
          • MySQL 5.7 too
      • Is anyone actually aware of that?
      • Why Swedish?
          • latin1_swedish_ci is the default collation
  • 6. Problem #1
      • Let’s build an application
        mysql> SELECT @@global.character_set_server, @@session.character_set_client;
        +-------------------------------+--------------------------------+
        | @@global.character_set_server | @@session.character_set_client |
        +-------------------------------+--------------------------------+
        | latin1                        | latin1                         |
        +-------------------------------+--------------------------------+
        1 row in set (0.00 sec)
        mysql> CREATE SCHEMA fosdem;
        Query OK, 1 row affected (0.00 sec)
        mysql> USE fosdem;
        mysql> CREATE TABLE locations (city VARCHAR(30) NOT NULL);
        Query OK, 0 rows affected (0.15 sec)
        mysql> SHOW CREATE TABLE locations\G
        *************************** 1. row ***************************
               Table: locations
        Create Table: CREATE TABLE `locations` (
          `city` varchar(30) NOT NULL
        ) ENGINE=InnoDB DEFAULT CHARSET=latin1
        1 row in set (0.00 sec)
  • 7. Problem #1
  • 8. Problem #1
  • 9. Problem #1
      • Everything is correct… NOT!
        mysql> SET NAMES utf8;
        Query OK, 0 rows affected (0.00 sec)
        mysql> select * from locations;
        +--------------------+
        | city               |
        +--------------------+
        | Berlin             |
        | Kraków            |
        | æ±äº¬éƒ½           |
        +--------------------+
        3 rows in set (0.00 sec)
        mysql> SET NAMES latin1;
        Query OK, 0 rows affected (0.00 sec)
        mysql> select * from locations;
        +-----------+
        | city      |
        +-----------+
        | Berlin    |
        | Kraków    |
        | 東京都    |
        +-----------+
        3 rows in set (0.00 sec)
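The output on slide 9 is what you would expect if UTF-8 bytes were written over a connection declared as latin1 into a latin1 column. The following Python sketch reproduces that effect outside MySQL. It is an illustration under assumptions: Python's `latin-1` codec is ISO-8859-1, while MySQL's latin1 is closer to cp1252; the two agree for the bytes used here.

```python
original = "Krak\u00f3w"   # the city name, with o-acute

# The application sends UTF-8 bytes...
wire_bytes = original.encode("utf-8")
# ...but the connection is declared as latin1, so every byte is taken
# to be one latin1 character.
as_seen_by_server = wire_bytes.decode("latin-1")

# Reading back over a latin1 connection hands the same bytes back, so the
# application appears to work.
read_as_latin1 = as_seen_by_server.encode("latin-1")
round_trip_ok = read_as_latin1.decode("utf-8") == original

# Reading over a utf8 connection transcodes each mislabeled character,
# so every original byte becomes a character of its own.
read_as_utf8 = as_seen_by_server.encode("utf-8")

print(round_trip_ok)
print(len(original), len(as_seen_by_server))
print(wire_bytes.hex(), read_as_utf8.hex())
```

The data looks fine only as long as every client repeats the same mistake.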
  • 10. Problem #1
      • Let’s fix this
      • Or can we ignore it?
          • Ruby may not like it
        # grep character-set-server /etc/mysql/my.cnf
        character-set-server = utf8
        mysql> SELECT @@global.character_set_server, @@session.character_set_client;
        +-------------------------------+--------------------------------+
        | @@global.character_set_server | @@session.character_set_client |
        +-------------------------------+--------------------------------+
        | utf8                          | utf8                           |
        +-------------------------------+--------------------------------+
        1 row in set (0.00 sec)
        ...we are fixing our tables here...
        mysql> SHOW CREATE TABLE locations\G
        *************************** 1. row ***************************
               Table: locations
        Create Table: CREATE TABLE `locations` (
          `city` varchar(30) NOT NULL
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8
        1 row in set (0.00 sec)
  • 11. Problem #1: The good news
      • It’s usually fixable
  • 12. Problem #2: Settings, defaults, inheritance
      • Where do you set character sets in MySQL?
      • Session settings
          • character_set_server
          • character_set_client
          • character_set_connection
          • character_set_database
          • character_set_results
      • Schema level defaults
      • Table level defaults
      • Column charsets
  • 13. Problem #2
      • Having fixed our problem #1, we continue to develop our application
        mysql> SELECT @@session.character_set_server, @@session.character_set_client;
        +--------------------------------+--------------------------------+
        | @@session.character_set_server | @@session.character_set_client |
        +--------------------------------+--------------------------------+
        | utf8                           | utf8                           |
        +--------------------------------+--------------------------------+
        1 row in set (0.00 sec)
        mysql> USE fosdem;
        mysql> CREATE TABLE people (first_name VARCHAR(30) NOT NULL, last_name VARCHAR(30) NOT NULL);
        Query OK, 0 rows affected (0.13 sec)
  • 14. Problem #2
  • 15. Problem #2
  • 16. Problem #2
      • Why is the table character set latin1?
        mysql> SELECT @@session.character_set_server, @@session.character_set_client;
        +--------------------------------+--------------------------------+
        | @@session.character_set_server | @@session.character_set_client |
        +--------------------------------+--------------------------------+
        | utf8                           | utf8                           |
        +--------------------------------+--------------------------------+
        1 row in set (0.00 sec)
        mysql> USE fosdem;
        mysql> SHOW CREATE TABLE people\G
        *************************** 1. row ***************************
               Table: people
        Create Table: CREATE TABLE `people` (
          `first_name` varchar(30) NOT NULL,
          `last_name` varchar(30) NOT NULL
        ) ENGINE=InnoDB DEFAULT CHARSET=latin1
        1 row in set (0.00 sec)
  • 17. Problem #2
      • What’s all this, then?
        mysql> SHOW SESSION VARIABLES LIKE 'character_set_%';
        +--------------------------+----------------------------+
        | Variable_name            | Value                      |
        +--------------------------+----------------------------+
        | character_set_client     | utf8                       |
        | character_set_connection | utf8                       |
        | character_set_database   | latin1                     |
        | character_set_filesystem | binary                     |
        | character_set_results    | utf8                       |
        | character_set_server     | utf8                       |
        | character_set_system     | utf8                       |
        | character_sets_dir       | /usr/share/mysql/charsets/ |
        +--------------------------+----------------------------+
        8 rows in set (0.00 sec)
        mysql> SHOW CREATE DATABASE fosdem\G
        *************************** 1. row ***************************
               Database: fosdem
        Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
        1 row in set (0.00 sec)
  • 18. Problem #2
      • Can we fix this?
        mysql> SET NAMES utf8;
        Query OK, 0 rows affected (0.00 sec)
        mysql> SELECT last_name, HEX(last_name) FROM people;
        +------------+----------------------+
        | last_name  | HEX(last_name)       |
        +------------+----------------------+
        | Lemon      | 4C656D6F6E           |
        | Müller     | 4DFC6C6C6572         |
        | Dobrza?ski | 446F62727A613F736B69 |
        +------------+----------------------+
        3 rows in set (0.00 sec)
        mysql> SET NAMES latin2;
        Query OK, 0 rows affected (0.00 sec)
        mysql> SELECT last_name, HEX(last_name) FROM people;
        +------------+----------------------+
        | last_name  | HEX(last_name)       |
        +------------+----------------------+
        | Lemon      | 4C656D6F6E           |
        | Müller     | 4DFC6C6C6572         |
        | Dobrza?ski | 446F62727A613F736B69 |
        +------------+----------------------+
        3 rows in set (0.00 sec)
      • We can’t! :-(
      • 0x3F is '?', so my 'ń' was lost
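The hex values on slide 18 can be reproduced without a database. In this Python sketch, `errors="replace"` is an assumption standing in for MySQL's implicit lossy conversion: any character the target character set cannot represent is replaced with '?'.

```python
names = ["Lemon", "M\u00fcller", "Dobrza\u0144ski"]

stored = {}
for name in names:
    # latin1 has no code for n-acute, so that letter is replaced by '?' (0x3F).
    stored[name] = name.encode("latin-1", errors="replace")
    print(stored[name].hex().upper())

lost = stored["Dobrza\u0144ski"]
# Once the substitution has happened, the stored bytes no longer say which
# letter used to be there, so no client-side setting can bring it back.
print(lost)
```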
  • 19. Problem #2: The bad news
      • It may not be enough to configure the server correctly
      • A mismatch between client and server can permanently break data
          • Implicit conversion inside MySQL server
  • 20. Problem #2: Settings, defaults, inheritance
      • Where do you set character sets in MySQL?
      • Session settings
          • character_set_server
          • character_set_client
          • character_set_connection
          • character_set_database
          • character_set_results
      • Schema level defaults – affect new tables
      • Table level defaults – affect new columns
      • Column charsets
  • 21. Problem #2: Settings, defaults, inheritance
        master [localhost] {msandbox} ((none)) > SELECT @@global.character_set_server, @@session.character_set_client;
        +-------------------------------+--------------------------------+
        | @@global.character_set_server | @@session.character_set_client |
        +-------------------------------+--------------------------------+
        | latin1                        | utf8                           |
        +-------------------------------+--------------------------------+
        1 row in set (0.00 sec)
        master [localhost] {msandbox} ((none)) > CREATE SCHEMA fosdem\G
        Query OK, 1 row affected (0.00 sec)
        master [localhost] {msandbox} ((none)) > SHOW CREATE SCHEMA fosdem\G
        *************************** 1. row ***************************
               Database: fosdem
        Create Database: CREATE DATABASE `fosdem` /*!40100 DEFAULT CHARACTER SET latin1 */
        1 row in set (0.00 sec)
  • 22. Problem #2: Settings, defaults, inheritance
        master [localhost] {msandbox} ((none)) > USE fosdem;
        Database changed
        master [localhost] {msandbox} (fosdem) > CREATE TABLE test (a VARCHAR(300), INDEX (a));
        Query OK, 0 rows affected (0.62 sec)
        master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
        *************************** 1. row ***************************
               Table: test
        Create Table: CREATE TABLE `test` (
          `a` varchar(300) DEFAULT NULL,
          KEY `a` (`a`)
        ) ENGINE=InnoDB DEFAULT CHARSET=latin1
        1 row in set (0.00 sec)
  • 23. Problem #2: Settings, defaults, inheritance
        master [localhost] {msandbox} (fosdem) > ALTER TABLE test DEFAULT CHARSET = utf8;
        Query OK, 0 rows affected (0.08 sec)
        Records: 0 Duplicates: 0 Warnings: 0
        master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
        *************************** 1. row ***************************
               Table: test
        Create Table: CREATE TABLE `test` (
          `a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
          KEY `a` (`a`)
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8
        1 row in set (0.00 sec)
  • 24. Problem #2: Settings, defaults, inheritance
        master [localhost] {msandbox} (fosdem) > ALTER TABLE test ADD b VARCHAR(10);
        Query OK, 0 rows affected (0.74 sec)
        Records: 0 Duplicates: 0 Warnings: 0
        master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
        *************************** 1. row ***************************
               Table: test
        Create Table: CREATE TABLE `test` (
          `a` varchar(300) CHARACTER SET latin1 DEFAULT NULL,
          `b` varchar(10) DEFAULT NULL,
          KEY `a` (`a`)
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8
        1 row in set (0.00 sec)
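The inheritance behaviour shown on slides 21 to 24 can be summarised with a toy model. The `Table` class below is hypothetical and written in Python purely for illustration; it is not how MySQL is implemented, it only mirrors the rule that a default is copied at creation time and is not applied retroactively.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy model of a table: a default charset plus per-column charsets."""
    default_charset: str
    columns: dict = field(default_factory=dict)  # column name -> charset

    def add_column(self, name, charset=None):
        # A column without an explicit charset copies the table default
        # that is in effect at the moment the column is created.
        self.columns[name] = charset or self.default_charset

    def alter_default_charset(self, charset):
        # Only the default changes; existing columns keep their charset.
        self.default_charset = charset


schema_default = "latin1"          # schema created while the server default was latin1
test = Table(default_charset=schema_default)
test.add_column("a")               # inherits latin1
test.alter_default_charset("utf8")
test.add_column("b")               # inherits utf8
print(test.columns)
```

Column `a` stays latin1 after the default changes, which matches the explicit `CHARACTER SET latin1` that appears on it in the `SHOW CREATE TABLE` output.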
  • 25. I f**ckd up. What do I do?
      • Let’s start with what you shouldn’t do
          • Keep calm and don’t start by changing something
      • Analyze the situation
          • Why did the problem occur in the first place?
      • Reassess the damage
          • Is it consistent?
              • Are all rows broken in the same way?
              • Are some rows bad, but others are okay?
              • Are all bad in several different ways?
          • Is it actually repairable?
              • No character mapping occurred during writes (e.g. unicode over latin1/latin1)
  • 26. I f**ckd up. What else I shouldn’t do, then?
      • Do not rush things as you may easily go from bad to worse
      • Do not start fixing this on a replication slave
      • You can’t fix this by fixing tables one by one on a live database
          • Unless you really have everything in one table
      • Do not use: ALTER TABLE … DEFAULT CHARSET = …
          • It only changes the default character set for new columns
      • Do not use: ALTER TABLE … CONVERT TO CHARACTER SET …
          • It’s not for fixing broken encoding
      • Do not use: ALTER TABLE … MODIFY col_name … CHARACTER SET …
  • 27. I f**ckd up. So how do I fix it?
      • What needs to be fixed?
      • Schema default character set
          • ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
      • Tables with text columns: CHAR, VARCHAR, TEXT, TINYTEXT, LONGTEXT
          • What about ENUM?
          • Use INFORMATION_SCHEMA to grab a list
      • What about other tables?
          • They too (eventually), but it’s not critical
        SELECT CONCAT(c.table_schema, '.', c.table_name) AS candidate_table
          FROM information_schema.columns c
         WHERE c.table_schema = 'fosdem'
           AND c.column_type REGEXP '^(.*CHAR|.*TEXT|ENUM)((.+))?$'
         GROUP BY candidate_table;
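To see which column types the pattern on slide 27 selects, it can be tried against a few sample `column_type` values. This Python sketch makes two assumptions: the pattern is copied exactly as it appears in the transcript (escaping of the parentheses may have been lost there), and Python's `re` with `IGNORECASE` is close enough to MySQL's case-insensitive REGEXP for these inputs. The sample values are made up.

```python
import re

# Pattern as it appears in the transcript, matched case-insensitively.
TEXT_TYPE = re.compile(r"^(.*CHAR|.*TEXT|ENUM)((.+))?$", re.IGNORECASE)

# Hypothetical column_type values of the kind information_schema reports.
sample_types = [
    "varchar(30)", "char(1)", "longtext", "enum('a','b')",
    "int(11)", "datetime", "varbinary(255)",
]

candidates = [t for t in sample_types if TEXT_TYPE.search(t)]
print(candidates)
```

Numeric, temporal and binary types fall through, which is the intent: only columns that carry a character set need converting.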
  • 28. I f**ckd up. So how do I fix it?
      • Option 1 – requires downtime
          • Dump and restore
      • Dump the data preserving the bad configuration and drop the old database
        bash# mysqldump -u root -p --skip-set-charset --default-character-set=latin1 fosdem > fosdem.sql
        mysql> DROP SCHEMA fosdem;
      • Correct table definitions in the dump file
          • Edit DEFAULT CHARSET in all CREATE TABLE statements
      • Create the database again and import the data back
        mysql> CREATE SCHEMA fosdem DEFAULT CHARSET utf8;
        bash# mysql -u root -p --default-character-set=utf8 fosdem < fosdem.sql
  • 29. I f**ckd up. So how do I fix it?
      • Option 2 – requires downtime
      • Perform a two step conversion with ALTER TABLE
          • Original encoding -> VARBINARY/BLOB -> Target encoding
          • Conversion from/to BINARY/BLOB removes character set context
      • How?
          • Stop applications
          • On each table, for each text column perform:
            ALTER TABLE tbl MODIFY col_name VARBINARY(255);
            ALTER TABLE tbl MODIFY col_name VARCHAR(255) CHARACTER SET utf8;
          • You may specify multiple columns per ALTER TABLE
          • Fix the problems (application and/or db configs)
          • Restart applications
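Why the detour through VARBINARY matters can be shown with plain byte strings. This is a Python sketch of the idea, not of MySQL internals; it assumes the damaged column holds UTF-8 bytes labeled as latin1, and it uses Python's `latin-1` codec as a stand-in for MySQL's latin1.

```python
original = "Krak\u00f3w"

# Contents of a latin1 column after UTF-8 bytes were written over a
# latin1 connection: every UTF-8 byte is held as one latin1 character.
column_latin1 = original.encode("utf-8").decode("latin-1")

# Direct conversion to utf8: each mislabeled character is transcoded
# faithfully, so the damage is carried over into the new encoding.
direct = column_latin1.encode("utf-8").decode("utf-8")

# Two-step route: drop to raw bytes first (the VARBINARY/BLOB step), then
# attach the utf8 label to those same bytes without transcoding them.
raw = column_latin1.encode("latin-1")
two_step = raw.decode("utf-8")

print(direct == original, two_step == original)
```

Only the second route recovers the original text, because the binary step discards the wrong label instead of converting through it.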
  • 30. I f**ckd up. So how do I fix it?
      • Option 3 – online character set fix; no downtime*
      • Thanks to our plugin for pt-online-schema-change
          • and a tiny patch for pt-online-schema-change that goes with the plugin
      • How?
          • Start pt-online-schema-change on all tables – one by one
              • Do not rotate tables (--no-swap-tables) or drop pt-osc triggers
          • Wait until all tables have been converted
          • Stop applications
          • Fix the problems (application and/or db configs)
          • Rotate tables – takes just 1 minute
          • Restart applications
      • Et voilà
  • 31. GOTCHAs!
      • Data space requirements may change during conversion
          • Latin1 uses 1 byte per character, utf8 will need to assume 3 bytes
          • VARCHAR/TEXT fit up to 64KB – it won’t fit 65536 multi-byte characters
          • Key length limit is 767 bytes
          • Data type and/or index length changes may be required
          • Test and plan this ahead
      • There may be more problems than you think
          • Detect irrecoverable problems with a simple stored function
        CREATE FUNCTION `cnv_test_conversion` (`value_before` LONGTEXT, `value_after` LONGTEXT)
        RETURNS tinyint(1)
        BEGIN
          RETURN (IFNULL(CONVERT(CONVERT(`value_before` USING latin1) USING binary), "") =
                  IFNULL(CONVERT(`value_after` USING binary), ""));
        END;;
  • 32. GOTCHAs!
        master [localhost] {msandbox} (fosdem) > ALTER TABLE test MODIFY a VARCHAR(300) CHARACTER SET utf8;
        Query OK, 0 rows affected, 1 warning (1.23 sec)
        Records: 0 Duplicates: 0 Warnings: 1
        master [localhost] {msandbox} (fosdem) > SHOW WARNINGS\G
        *************************** 1. row ***************************
          Level: Warning
           Code: 1071
        Message: Specified key was too long; max key length is 767 bytes
        1 row in set (0.00 sec)
        master [localhost] {msandbox} (fosdem) > SHOW CREATE TABLE test\G
        *************************** 1. row ***************************
               Table: test
        Create Table: CREATE TABLE `test` (
          `a` varchar(300) DEFAULT NULL,
          `b` varchar(10) DEFAULT NULL,
          KEY `a` (`a`(255))
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8
        1 row in set (0.00 sec)
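The warning on slide 32 follows from simple arithmetic on the figures quoted on slide 31. A short Python sketch, using only those numbers (767-byte key limit, 1 byte per latin1 character, 3 bytes assumed per utf8 character); the helper names are made up for this example:

```python
KEY_LIMIT_BYTES = 767                         # index key limit quoted on slide 31
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3}     # worst case per character


def index_bytes(chars, charset):
    """Worst-case bytes needed to index `chars` characters."""
    return chars * BYTES_PER_CHAR[charset]


def max_indexable_chars(charset):
    """Longest prefix, in characters, that fits under the key limit."""
    return KEY_LIMIT_BYTES // BYTES_PER_CHAR[charset]


for charset in ("latin1", "utf8"):
    needed = index_bytes(300, charset)
    print(charset, needed, needed <= KEY_LIMIT_BYTES, max_indexable_chars(charset))
```

A full index on VARCHAR(300) needs 900 bytes in utf8, which exceeds the limit, and 255 characters is the longest prefix that still fits. That is the `a`(255) prefix visible in the resulting table definition.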
  • 33. How to do it right?
      • Set character-set-server during initial configuration
      • When creating new schemas, always specify the desired charset
          • CREATE SCHEMA fosdem DEFAULT CHARSET = utf8
          • ALTER SCHEMA fosdem DEFAULT CHARSET = utf8
      • When creating new tables, also explicitly specify the charset
          • CREATE TABLE people (…) DEFAULT CHARSET = utf8
      • And don’t forget to configure applications too
      • You can try to force charset on the clients
          • init-connect = "SET NAMES utf8"
          • It might also break applications that don’t want to talk to MySQL using utf8
  • 34. Oh, and one more thing…
  • 35. WebScaleSQL
      • We are sharing WebScaleSQL packages with the MySQL Community!
      • Check out http://www.psce.com/blog for details
      • Follow @dbasquare to receive updates
      • What is WebScaleSQL? WebScaleSQL is a collaboration among engineers from several companies, such as Facebook, Twitter, Google or LinkedIn, that face the same challenges in deploying MySQL at scale and seek greater performance from a database technology tailored for their needs.