MySQL with 🔔 & 🎵
@thomas_shone
🪠 How it started
People
Xin Wu
Simon Mudd
Daniël van Eeden
Mattias Jonsson
Kristian Köhntopp (@isotopp)
📦 Cargo Culting
⚙ Inner Workings
🛠 Cool Tools
👶 My First Table
CREATE TABLE users (
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(300),
password VARCHAR(72),
address TEXT,
country_iso VARCHAR(2),
language_iso VARCHAR(2),
province VARCHAR(64),
joined DATETIME
);
PRIMARY KEY TIMESTAMP DATETIME VARCHAR vs CHAR
INDEX Singular vs Plural Language vs Locale TEXT
Splitting DEFAULTs NULLiness Email length
ENUMs Bcrypt Length Assumptions Normalization
🎲 Bingo Card
bob.smith+tag1+tag2@gmail.com === bobsmith@gmail.com
bob.smith@gmail.com === bobsmith@gmail.com
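The two equivalences above reflect how Gmail treats the local part of an address: dots are ignored and anything after a `+` is dropped. A minimal Python sketch of that canonicalisation follows, assuming those two rules and applying them to `gmail.com` only. The function name `canonical_gmail` is illustrative and not from the talk.

```python
def canonical_gmail(address: str) -> str:
    """Collapse Gmail aliases to one canonical address.

    Assumes Gmail's alias rules: dots in the local part are ignored and
    everything from the first '+' onwards is dropped. For any other
    domain only the domain part is lowercased.
    """
    local, _, domain = address.strip().rpartition("@")
    domain = domain.lower()
    if domain == "gmail.com":
        local = local.split("+", 1)[0].replace(".", "").lower()
    return f"{local}@{domain}"


print(canonical_gmail("bob.smith+tag1+tag2@gmail.com"))  # bobsmith@gmail.com
print(canonical_gmail("bob.smith@gmail.com"))            # bobsmith@gmail.com
```

A UNIQUE constraint on the raw `email` column compares the stored strings, so it would accept all three spellings as different users.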
👶 My First Table
CREATE TABLE users (
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(255),
password VARCHAR(72),
address TEXT,
country_iso VARCHAR(2),
language_iso VARCHAR(2),
province VARCHAR(64),
joined DATETIME
);
What engine is used?
What is the primary key?
⚙MyISAM vs InnoDB
Transactions
Row level locking
ACID compliant
Faster reads
SELECT TABLE_SCHEMA, TABLE_NAME
FROM information_schema.TABLES
WHERE ENGINE = 'MyISAM';
👶 My First Table
CREATE TABLE users (
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(255),
password VARCHAR(72),
address TEXT,
country_iso VARCHAR(2),
language_iso VARCHAR(2),
province VARCHAR(64),
joined DATETIME
) ENGINE=InnoDB;
What engine is used?
What is a clustered index?
What happens without a NOT NULL UNIQUE?
Resource: https://blog.jcole.us/2013/05/02/how-does-innodb-behave-without-a-primary-key/
What is the primary key?
SELECT database_name, table_name
FROM mysql.innodb_index_stats
WHERE index_name = 'GEN_CLUST_INDEX'
So is a 🔑 PRIMARY KEY needed?
Relationships
CREATE TABLE users (
email VARCHAR(255) NOT NULL UNIQUE,
…
);
CREATE TABLE posts (
email VARCHAR(255),
title VARCHAR(255) NOT NULL UNIQUE,
slug VARCHAR(255) NOT NULL UNIQUE,
content TEXT,
status ENUM('draft', 'published')
);
SET_VAR(sql_require_primary_key=ON)
Resource: https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_sql_require_primary_key
What to use as a PRIMARY KEY?
● INT UNSIGNED NOT NULL AUTO_INCREMENT
● UUID as BINARY(16)
● TEXT
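To make the "UUID as BINARY(16)" option concrete, here is a small Python sketch comparing the size of a UUID in its usual text form with its raw binary form. The variable names are illustrative. The UUID is a random version 4 value, which is an assumption, since the talk does not say which UUID version it has in mind.

```python
import uuid

# A random (version 4) UUID; the value itself is arbitrary.
user_id = uuid.uuid4()

text_form = str(user_id)     # e.g. 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
binary_form = user_id.bytes  # raw bytes, the shape a BINARY(16) column holds

print(len(text_form))    # 36
print(len(binary_form))  # 16

# The binary form round-trips back to the same UUID.
assert uuid.UUID(bytes=binary_form) == user_id
```

Storing the 16 raw bytes rather than the 36-character string keeps the key less than half the size.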
📅 Dates, Times & Timezones
HERE BE DRAGONS!
Formats
● DATE
○ Min: 1000-01-01
○ Max: 9999-12-31
● DATETIME
○ Min: 1000-01-01 00:00:00
○ Max: 9999-12-31 23:59:59
● TIMESTAMP
○ Min: 1970-01-01 00:00:01
○ Max: 2038-01-19 03:14:07
● TIME
○ Min: -838:59:59
○ Max: 838:59:59
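The TIMESTAMP limits above line up with a signed 32-bit count of seconds since the Unix epoch, interpreted in UTC. A short Python check of that arithmetic (a sketch of the reasoning, not of how MySQL is implemented internally):

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit integer can hold, read as seconds since 1970-01-01 UTC.
MAX_INT32 = 2**31 - 1

latest = datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)
earliest = datetime.fromtimestamp(1, tz=timezone.utc)

print(earliest.strftime("%Y-%m-%d %H:%M:%S"))  # 1970-01-01 00:00:01
print(latest.strftime("%Y-%m-%d %H:%M:%S"))    # 2038-01-19 03:14:07
```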
🕑 UTC vs Timezone?
Resource: https://dev.mysql.com/doc/refman/8.0/en/datetime.html
SELECT @@GLOBAL.time_zone,
@@SESSION.time_zone;
SET GLOBAL time_zone = timezone;
SET time_zone = timezone;
Resource: https://dev.mysql.com/doc/refman/8.0/en/time-zone-support.html
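According to the MySQL documentation linked above, TIMESTAMP values are converted from the session time zone to UTC for storage and back again on retrieval, while DATETIME values are stored as written. The Python sketch below imitates that difference with one instant and two hypothetical sessions. The date and the offsets are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# One instant held in UTC, roughly what a TIMESTAMP column stores.
stored_utc = datetime(2024, 1, 15, 12, 0, 0, tzinfo=timezone.utc)

# Two hypothetical sessions with different time_zone settings.
session_plus_2 = timezone(timedelta(hours=2))    # like SET time_zone = '+02:00'
session_minus_5 = timezone(timedelta(hours=-5))  # like SET time_zone = '-05:00'

fmt = "%Y-%m-%d %H:%M:%S"
print(stored_utc.astimezone(session_plus_2).strftime(fmt))   # 2024-01-15 14:00:00
print(stored_utc.astimezone(session_minus_5).strftime(fmt))  # 2024-01-15 07:00:00

# A DATETIME behaves like a naive value: no zone attached, no conversion applied.
naive = datetime(2024, 1, 15, 12, 0, 0)
print(naive.tzinfo)  # None
```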
🔎 Performant Search
Physical Storage of InnoDB
Physical Storage of InnoDB Index
🧮 1,048,576 rows only need a tree depth of 10
Physical Storage of InnoDB Index
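The figure of 10 levels for 1,048,576 rows works out if every node points at four children, since 4^10 = 1,048,576. That fan-out of 4 is inferred from the number on the slide rather than stated in the talk. The sketch below computes the depth for any fan-out; the value 100 is a hypothetical stand-in for a wider node.

```python
def tree_depth(rows: int, fanout: int) -> int:
    """Smallest number of levels such that fanout ** levels >= rows."""
    depth, capacity = 0, 1
    while capacity < rows:
        capacity *= fanout
        depth += 1
    return depth


print(tree_depth(1_048_576, 4))    # 10
print(tree_depth(1_048_576, 2))    # 20
print(tree_depth(1_048_576, 100))  # 4
```

The wider each node, the fewer levels a lookup has to walk, which is why an indexed lookup stays cheap as the table grows.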
📇 Identifying Index Needs
Is there a need?
● User complaints?
● Personal experience and frustrations?
● Resource hogging?
● Slow-running processes?
● Treat optimizing queries like you’d optimize anything
> EXPLAIN SELECT * FROM users WHERE name = "Bob Smith";
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
| 1 | SIMPLE | users | NULL | ALL | NULL | NULL | NULL | NULL | 1582 | 1.0 | Using where; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
> EXPLAIN SELECT id, email FROM users WHERE email = "bob@smith.com";
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
| 1 | SIMPLE | users | NULL | INDEX | email | email | 256 | NULL | 1 | 100.0 | Using index |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
> EXPLAIN SELECT * FROM users WHERE id = 123456;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | PRIMARY | users | NULL | CONST | PRIMARY | PRIMARY | 4 | NULL | 1 | 100.0 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
EXPLAIN your queries
Resource: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
# Hints to use one of the following indexes
SELECT * FROM table USE INDEX (index_name,...) WHERE ...;
# Same as above but assumes a full table scan is very expensive
SELECT * FROM table FORCE INDEX (index_name,...) WHERE ...;
# Ensures that certain indexes aren’t being used
SELECT * FROM table IGNORE INDEX (index_name,...) WHERE ...;
Resource: https://dev.mysql.com/doc/refman/8.0/en/index-hints.html
Use the FORCE INDEX
SHOW [GLOBAL | SESSION] VARIABLES
[LIKE 'pattern' | WHERE expr]
SET variable = expr [, variable = expr] ...
> SET GLOBAL slow_query_log = 1;
> SET GLOBAL long_query_time = 10; # Default
> SET GLOBAL min_examined_row_limit = 100;
> SET GLOBAL log_queries_not_using_indexes = 'ON';
> SET GLOBAL slow_query_log_file = 'host_name-slow.log'; # Default
Enable Logging
Resource: https://dev.mysql.com/doc/refman/5.6/en/slow-query-log.html
$ vim /etc/mysql/conf.d/mysql.cnf # depends on distro
[mysqld]
performance_schema=ON
$ service mysqld restart # depends on distro too!
$ mysql
> SHOW VARIABLES WHERE Variable_name = 'performance_schema';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| performance_schema | ON |
+--------------------+-------+
Performance Schema
Resource: https://dev.mysql.com/doc/refman/5.7/en/performance-schema.html
> SELECT DIGEST_TEXT, SUM_SELECT_SCAN
FROM performance_schema.events_statements_summary_by_digest
WHERE SUM_SELECT_SCAN > 0 AND SCHEMA_NAME = [schema] ORDER BY SUM_SELECT_SCAN DESC;
+----------------------------------------------+-----------------+
| DIGEST_TEXT | SUM_SELECT_SCAN |
+----------------------------------------------+-----------------+
| SELECT `*` FROM `users` WHERE ( `name` = ? ) | 700 |
+----------------------------------------------+-----------------+
> SELECT DIGEST_TEXT, SUM_ROWS_EXAMINED / SUM_ROWS_SENT AS RATIO
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = [schema] ORDER BY RATIO DESC;
+---------------------------------------------------------------+-------+
| DIGEST_TEXT | RATIO |
+---------------------------------------------------------------+-------+
| SELECT `*` FROM `users` WHERE ( `name` = ? AND `joined` > ? ) | 10730 |
+---------------------------------------------------------------+-------+
events_statements_summary_by_digest
Resource: https://dev.mysql.com/doc/refman/5.7/en/performance-schema.html
🏗 Index Creation Options
Creation
● Combining columns
○ INDEX idx_columnA_columnB (columnA, columnB)
● Truncating columns
○ INDEX idx_columnC (columnC(72))
● Primary key considerations
○ Primary key is appended to all secondary indexes.
○ The size of your primary key impacts your index size.
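A back-of-the-envelope Python sketch of why primary key size matters: every secondary index entry carries a copy of the primary key, so the extra bytes scale with rows × secondary indexes × key size. The row count, index count, and the 25-byte average email length are hypothetical, and the estimate ignores page overhead and fill factor.

```python
def pk_overhead_bytes(rows: int, secondary_indexes: int, pk_bytes: int) -> int:
    """Bytes spent on primary key copies inside secondary indexes (rough estimate)."""
    return rows * secondary_indexes * pk_bytes


ROWS = 10_000_000  # hypothetical table size
INDEXES = 4        # hypothetical number of secondary indexes

candidates = [
    ("INT UNSIGNED", 4),
    ("UUID as BINARY(16)", 16),
    ("email, assuming ~25 bytes on average", 25),
]
for label, pk_bytes in candidates:
    mib = pk_overhead_bytes(ROWS, INDEXES, pk_bytes) / 2**20
    print(f"{label:<40} {mib:8.1f} MiB")
```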
CREATE TABLE users (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
email VARCHAR(255),
name VARCHAR(255),
INDEX idx_email_name (email, name),
INDEX idx_name_email (name, email),
PRIMARY KEY (id)
);
SELECT * FROM users WHERE email='bob@smith.com';
Which index will be used?
Creation
CREATE TABLE users (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(255),
email VARCHAR(255) NOT NULL UNIQUE,
password VARCHAR(72),
address TEXT,
country_iso VARCHAR(2),
province VARCHAR(64),
INDEX idx_email (email),
INDEX idx_name (name),
INDEX idx_country_iso (country_iso),
INDEX idx_province (province),
PRIMARY KEY (id)
);
Original
CREATE TABLE users (
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(255),
password VARCHAR(72),
address TEXT,
country_iso VARCHAR(2),
language_iso VARCHAR(2),
province VARCHAR(64),
joined DATETIME
);
Second Attempt
CREATE TABLE users (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(255),
email VARCHAR(255) NOT NULL UNIQUE,
password VARCHAR(72),
address TEXT,
country_iso VARCHAR(2),
province VARCHAR(64),
UNIQUE idx_email (email),
INDEX idx_name (name(16)),
INDEX idx_country_iso_province (country_iso, province(16)),
PRIMARY KEY (id)
);
🧮 How are Indexes Selected?
Query to Index Selection
1. MySQL will use only ONE index for the WHERE clause
a. Have you hinted/forced an index?
b. Do you have an index that matches the WHERE clause?
c. Do you have one (or more) index(es) that match part of the WHERE clause?
i. We can use the leftmost part of multi-column indexes.
ii. The index with the smallest row result is picked.
2. ORDER/GROUP BY uses available indexes on the LEFTMOST column
GROUP BY columnA, columnB ORDER BY columnC, columnD
Resource: https://dev.mysql.com/doc/refman/8.0/en/mysql-indexes.html
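The leftmost-prefix rule in step 1.c can be modelled in a few lines of Python. This is a deliberately simplified sketch: it only handles equality filters and ignores the cost-based choices the real optimizer makes. The helper name `usable_prefix` is illustrative; the two index definitions are the ones from the earlier "Which index will be used?" slide.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the WHERE clause can use.

    Walks the index left to right and stops at the first column
    that the query does not filter on.
    """
    usable = []
    for column in index_columns:
        if column not in equality_columns:
            break
        usable.append(column)
    return usable


indexes = {
    "idx_email_name": ("email", "name"),
    "idx_name_email": ("name", "email"),
}
where_columns = {"email"}  # WHERE email = 'bob@smith.com'

for name, columns in indexes.items():
    print(name, usable_prefix(columns, where_columns))
# idx_email_name ['email']
# idx_name_email []
```

Under this model only `idx_email_name` helps a query that filters on `email` alone, because `email` is not the leftmost column of `idx_name_email`.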
📈Evaluating an Index
Evaluating Indexes
● Same tools as the Identification process
● Is the index redundant?
SELECT table_name, redundant_index_name, dominant_index_name
FROM sys.schema_redundant_indexes
WHERE table_schema = [schema]
● Is the index even used?
SELECT object_schema, object_name, index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
AND index_name != 'PRIMARY'
AND count_star = 0
AND object_schema = [schema]
Relationships
CREATE TABLE users (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
…
);
CREATE TABLE posts (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
author_user_id INT UNSIGNED NOT NULL,
…
FOREIGN KEY (author_user_id)
REFERENCES users(id)
ON DELETE CASCADE
);
🔑FOREIGN KEY or not 💀FOREIGN KEY?
Using foreign keys
Cons
● Locks
● Performance
● Distributed systems (multiple databases)
● 15-year-old bug (foreign keys and triggers)
● Logic in database not code
Pros
● Data consistency on INSERT and DELETE
● Enforced type constraints between fields
⚖Scaling
Cargo Culting Text
column VARCHAR(255) NOT NULL DEFAULT ''
Data Types
Size of Field = header + body
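VARCHAR(64) = 1 byte header (to store the length; 2 bytes if values can exceed 255 bytes)
+ up to 64 characters
1 character = 1-4 bytes (depending on encoding)
-------------------------------------------------
Minimum = 1 byte (empty string, single-byte encoding)
Maximum = 258 bytes (using utf8mb4: 2 byte header + 64 × 4 bytes)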
Resource: https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html
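Following the rule on the storage requirements page linked above (a 1-byte length prefix when values need at most 255 bytes, 2 bytes otherwise), the worst case for a VARCHAR column can be estimated as below. Note that a utf8mb4 VARCHAR(64) can need 256 bytes of data, which just crosses into the 2-byte prefix. The function name is illustrative, and the sketch ignores any per-row overhead the storage engine adds.

```python
def varchar_max_bytes(max_chars: int, bytes_per_char: int) -> int:
    """Worst-case storage for VARCHAR(max_chars): length prefix plus data."""
    data = max_chars * bytes_per_char
    prefix = 1 if data <= 255 else 2
    return prefix + data


print(varchar_max_bytes(64, 1))   # 65   (single-byte charset such as latin1)
print(varchar_max_bytes(64, 4))   # 258  (utf8mb4)
print(varchar_max_bytes(255, 4))  # 1022 (utf8mb4)
```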
🛠 Cool Tools
🪑 MySQL Workbench
Resource: https://www.mysql.com/products/workbench/
🗃 mysqldump
Resource: https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html
$ vim backup_database.sh
# Quick and dirty MySQL backup solution
DATETIME=$(date +%Y%m%d-%H%M%S)
# Which database to backup automatically
DATABASE=db_name
# Use file based password handling
mysqldump [options] "${DATABASE}" | gzip > "${DATABASE}.${DATETIME}.sql.gz"
$ crontab -e
# Every day at 4am
0 4 * * * backup_database.sh
🕵 Cadfael
Resource: https://github.com/xsist10/cadfael
$ composer global require cadfael/cadfael # or get the phar
$ ./vendor/bin/cadfael run
--host [host] # Where is the database host?
--username [username] # MySQL username
--performance_schema # Run performance schema checks
[schema1] [...schema2] # Which schemas to check?
+----------------------+-------------------------------+---------+--------------------------------------------------------------------------------------+
| Check | Entity | Status | Message |
+----------------------+-------------------------------+---------+--------------------------------------------------------------------------------------+
| SaneInnoDbPrimaryKey | table_with_insane_primary_key | Warning | In InnoDB tables, the PRIMARY KEY is appended to other indexes. |
| | | | If the PRIMARY KEY is big, other indexes will use more space. |
| | | | Maybe turn your PRIMARY KEY into UNIQUE and add an auto_increment PRIMARY KEY. |
| | | | Reference: https://dev.mysql.com/doc/refman/5.7/en/innodb-index-types.html |
| EmptyTable | empty_table | Warning | Table contains no records. |
| RedundantIndexes | table_with_insane_primary_key | Concern | Redundant index `full_name` (superseded by `full_name_height_in_cm`). |
| | | | A redundant index can probably drop it (unless it's a UNIQUE, in which case the |
| | | | dominant index might be a better candidate for reworking). |
| | | | Reference: https://dev.mysql.com/doc/refman/8.0/en/sys-schema-redundant-indexes.html |
+----------------------+-------------------------------+---------+--------------------------------------------------------------------------------------+
🔄Online Schema Change
Perform schema changes on really large tables without locking using OSC.
Resource: https://github.com/facebookincubator/OnlineSchemaChange/wiki/How-OSC-works
[Diagram: rows are copied from MyTable into MyTable_new; then, in one atomic step, MyTable is renamed to MyTable_old and MyTable_new is renamed to MyTable.]
⏭ So what did I cowardly skip?
🙋Questions?
@thomas_shone

More Related Content

Recently uploaded

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 

Recently uploaded (20)

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 

Featured

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

MySQL with  & 

  • 2. 🪠 How it started
  • 3. 󰱖People Xin Wu Simon Mudd Daniël van Eeden Mattias Jonsson Kristian Köhntopp (@isotopp)
  • 7. 👶 My First Table CREATE TABLE users ( email VARCHAR(255) NOT NULL UNIQUE, name VARCHAR(300), password VARCHAR(72), address TEXT, country_iso VARCHAR(2), language_iso VARCHAR(2), province VARCHAR(64), joined DATETIME );
  • 8. PRIMARY KEY TIMESTAMP DATETIME VARCHAR vs CHAR INDEX Singular vs Plural Language vs Locale TEXT Splitting DEFAULTs NULLiness Email length ENUMs Bcrypt Length Assumptions Normalization 🎲 Bingo Card bob.smith+tag1+tag2@gmail.com === bobsmith@gmail.com bob.smith@gmail.com === bobsmith@gmail.com
  • 9. 👶 My First Table CREATE TABLE users ( email VARCHAR(255) NOT NULL UNIQUE, name VARCHAR(255), password VARCHAR(72), address TEXT, country_iso VARCHAR(2), language_iso VARCHAR(2), province VARCHAR(64), joined DATETIME ); What engine is used? What is the primary key?
  • 10. ⚙MyISAM vs InnoDB Transactions Row level locking ACID compliant Faster reads
  • 11. SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE ENGINE = 'myISAM'
  • 12. 👶 My First Table CREATE TABLE users ( email VARCHAR(255) NOT NULL UNIQUE, name VARCHAR(255), password VARCHAR(72), address TEXT, country_iso VARCHAR(2), language_iso VARCHAR(2), province VARCHAR(64), joined DATETIME ) ENGINE=InnoDB; What engine is used? What is a clustered index? What happens without a NOT NULL UNIQUE? Resource: https://blog.jcole.us/2013/05/02/how-does-innodb-behave-without-a-primary-key/ What is the primary key?
  • 13. SELECT database_name, table_name FROM mysql.innodb_index_stats WHERE index_name = 'GEN_CLUST_INDEX'
  • 14. So is a 🔑 PRIMARY KEY needed?
  • 15. Relationships CREATE TABLE users ( email VARCHAR(255) NOT NULL UNIQUE, … ); CREATE TABLE posts ( email VARCHAR(255), title VARCHAR(255) NOT NULL UNIQUE, slug VARCHAR(255) NOT NULL UNIQUE, content TEXT, status ENUM('draft', 'published') );
  • 17. What to use as a PRIMARY KEY? ● INT UNSIGNED NOT NULL AUTO_INCREMENT ● UUID as BINARY(16) ● TEXT
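
The first two options on this slide could be sketched as follows. This is a minimal illustration, not the speaker's schema: the table names are hypothetical, and `UUID_TO_BIN` / `BIN_TO_UUID` assume MySQL 8.0 or later.

```sql
-- Option 1: a compact 4-byte surrogate key.
CREATE TABLE users_int_pk (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
  email VARCHAR(255) NOT NULL UNIQUE,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Option 2: a UUID stored as 16 raw bytes rather than a 36-character string.
CREATE TABLE users_uuid_pk (
  id    BINARY(16) NOT NULL,
  email VARCHAR(255) NOT NULL UNIQUE,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- The swap flag (1) moves the timestamp part of a version-1 UUID to the
-- front, so consecutive values land close together in the clustered index.
INSERT INTO users_uuid_pk (id, email)
VALUES (UUID_TO_BIN(UUID(), 1), 'bob@smith.com');

-- Convert back to the textual form when reading.
SELECT BIN_TO_UUID(id, 1) AS id, email FROM users_uuid_pk;

-- Option 3 (TEXT) cannot be declared directly: MySQL rejects a TEXT column
-- in a key unless a prefix length is given.
```
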
  • 18. 📅 Dates, Times & Timezones
  • 20. Formats ● DATE ○ Min: 1000-01-01 ○ Max: 9999-12-31 ● DATETIME ○ Min: 1000-01-01 00:00:00 ○ Max: 9999-12-31 23:59:59 ● TIMESTAMP ○ Min: 1970-01-01 00:00:01 ○ Max: 2038-01-19 03:14:07 ● TIME ○ Min: -838:59:59 ○ Max: 838:59:59
  • 21. 🕑 UTC vs Timezone? Resource: https://dev.mysql.com/doc/refman/8.0/en/datetime.html
  • 22. SELECT @@GLOBAL.time_zone, @@SESSION.time_zone; SET GLOBAL time_zone = timezone; SET time_zone = timezone; Resource: https://dev.mysql.com/doc/refman/8.0/en/time-zone-support.html
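
A small experiment makes the difference between the two types visible. It is a sketch with a hypothetical table name (`tz_demo`) and uses fixed UTC offsets, so it does not depend on the time zone tables being loaded.

```sql
CREATE TABLE tz_demo (
  as_datetime  DATETIME,
  as_timestamp TIMESTAMP NULL
);

-- Write both columns while the session is on UTC.
SET time_zone = '+00:00';
INSERT INTO tz_demo VALUES ('2024-01-01 12:00:00', '2024-01-01 12:00:00');

-- Read them back from a session two hours ahead of UTC.
SET time_zone = '+02:00';

-- DATETIME is returned exactly as written (12:00:00).
-- TIMESTAMP is stored as UTC and converted to the session time zone,
-- so it should come back as 14:00:00.
SELECT as_datetime, as_timestamp FROM tz_demo;
```
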
  • 25. Physical Storage of InnoDB Index
  • 26. 🧮 1,048,576 rows only needs a tree depth of 10
  • 27. Physical Storage of InnoDB Index
  • 29. Is there a need? ● User complaints? ● Personal experience and frustrations? ● Resource hogging? ● Slow running processes ● Treat optimizing queries like you’d optimize anything
  • 30. EXPLAIN your queries
      > EXPLAIN SELECT * FROM users WHERE name = "Bob Smith";
      +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
      | id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                       |
      +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
      |  1 | SIMPLE      | users | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 1582 |      1.0 | Using where; Using filesort |
      +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------------------+
      > EXPLAIN SELECT id, email FROM users WHERE email = "bob@smith.com";
      +----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
      | id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra       |
      +----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
      |  1 | SIMPLE      | users | NULL       | INDEX | email         | email     | 256     | NULL |    1 |    100.0 | Using index |
      +----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-------------+
      > EXPLAIN SELECT * FROM users WHERE id = 123456;
      +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
      | id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
      +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
      |  1 | PRIMARY     | users | NULL       | CONST | PRIMARY       | PRIMARY | 4       | NULL |    1 |    100.0 | NULL        |
      +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
      Resource: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
  • 31. # Hints to use one of the following indexes SELECT * FROM table USE INDEX (index_name,...) WHERE ...; # Same as above but assumes a full table scan is very expensive SELECT * FROM table FORCE INDEX (index_name,...) WHERE ...; # Ensures that certain indexes aren’t being used SELECT * FROM table IGNORE INDEX (index_name,...) WHERE ...; Resource: https://dev.mysql.com/doc/refman/8.0/en/index-hints.html Use the FORCE INDEX
  • 32. Enable Logging SHOW [GLOBAL | SESSION] VARIABLES [LIKE 'pattern' | WHERE expr] SET variable = expr [, variable = expr] ... > SET GLOBAL slow_query_log = 1; > SET GLOBAL long_query_time = 10; # Default > SET GLOBAL min_examined_row_limit = 100; > SET GLOBAL log_queries_not_using_indexes = 'ON'; > SET GLOBAL slow_query_log_file = 'host_name-slow.log'; # Default Resource: https://dev.mysql.com/doc/refman/5.6/en/slow-query-log.html
  • 33. $ vim /etc/mysql/conf.d/mysql.cnf # depends on distro [mysqld] performance_schema=ON $ service mysqld restart # depends on distro too! $ mysql > SHOW VARIABLES WHERE Variable_name = 'performance_schema'; +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | performance_schema | ON | +--------------------+-------+ Performance Schema Resource: https://dev.mysql.com/doc/refman/5.7/en/performance-schema.html
  • 34. > SELECT DIGEST_TEXT, SUM_SELECT_SCAN FROM performance_schema.events_statements_summary_by_digest WHERE SUM_SELECT_SCAN > 0 AND SCHEMA_NAME = [schema] ORDER BY SUM_SELECT_SCAN DESC; +----------------------------------------------+-----------------+ | DIGEST_TEXT | SUM_SELECT_SCAN | +----------------------------------------------+-----------------+ | SELECT `*` FROM `users` WHERE ( `name` = ? ) | 700 | +----------------------------------------------+-----------------+ > SELECT DIGEST_TEXT, SUM_ROWS_EXAMINED / SUM_ROWS_SENT AS RATIO FROM performance_schema.events_statements_summary_by_digest WHERE SCHEMA_NAME = [schema] ORDER BY RATIO DESC; +---------------------------------------------------------------+-------+ | DIGEST_TEXT | RATIO | +---------------------------------------------------------------+-------+ | SELECT `*` FROM `users` WHERE ( `name` = ? AND `joined` > ? ) | 10730 | +---------------------------------------------------------------+-------+ events_statements_summary_by_digest Resource: https://dev.mysql.com/doc/refman/5.7/en/performance-schema.html
  • 36. Creation ● Combining columns ○ INDEX idx_columnA_columnB (columnA, columnB) ● Truncating columns ○ INDEX idx_columnC (columnC(72)) ● Primary key considerations ○ The primary key is appended to every secondary index. ○ The size of your primary key therefore affects the size of all your indexes.
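
Both index forms from this slide can be tried on a throwaway table. The table name `index_demo` and the column sizes are assumptions made for the illustration.

```sql
CREATE TABLE index_demo (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
  columnA VARCHAR(64),
  columnB VARCHAR(64),
  columnC VARCHAR(255),
  PRIMARY KEY (id),
  -- Combined index: serves lookups on columnA alone or on columnA + columnB.
  INDEX idx_columnA_columnB (columnA, columnB),
  -- Prefix index: only the first 72 characters of columnC are indexed.
  INDEX idx_columnC (columnC(72))
) ENGINE=InnoDB;

-- The Sub_part column shows the indexed prefix length (72 for idx_columnC,
-- NULL where the whole column is indexed).
SHOW INDEX FROM index_demo;
```
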
  • 37. CREATE TABLE users ( id INT UNSIGNED NOT NULL AUTO_INCREMENT, email VARCHAR(255), name VARCHAR(255), INDEX idx_email_name (email, name), INDEX idx_name_email (name, email), PRIMARY KEY (id) ); SELECT * FROM users WHERE email='bob@smith.com'; Which index will be used?
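
One way to answer the question is to ask the optimizer. The sketch below recreates the slide's table under a hypothetical name (`users_idx_demo`) so it does not clash with an existing `users` table.

```sql
CREATE TABLE users_idx_demo (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
  email VARCHAR(255),
  name  VARCHAR(255),
  INDEX idx_email_name (email, name),
  INDEX idx_name_email (name, email),
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Only idx_email_name has email as its leftmost column, so it is the index
-- expected to appear in the possible_keys and key columns of the plan.
EXPLAIN SELECT * FROM users_idx_demo WHERE email = 'bob@smith.com';
```
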
  • 38. Creation CREATE TABLE users ( id INT UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(255), email VARCHAR(255) NOT NULL UNIQUE, password VARCHAR(72), address TEXT, country_iso VARCHAR(2), province VARCHAR(64), INDEX idx_email (email), INDEX idx_name (name), INDEX idx_country_iso (country_iso), INDEX idx_province (province), PRIMARY KEY (id) );
  • 39. Original CREATE TABLE users ( email VARCHAR(255) NOT NULL UNIQUE, name VARCHAR(255), password VARCHAR(72), address TEXT, country_iso VARCHAR(2), language_iso VARCHAR(2), province VARCHAR(64), joined DATETIME );
  • 40. Second Attempt CREATE TABLE users ( id INT UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(255), email VARCHAR(255) NOT NULL UNIQUE, password VARCHAR(72), address TEXT, country_iso VARCHAR(2), province VARCHAR(64), UNIQUE idx_email (email), INDEX idx_name (name(16)), INDEX idx_country_iso_province (country_iso, province(16)), PRIMARY KEY (id) );
  • 41. 🧮 How are Indexes Selected?
  • 42. Query to Index Selection 1. MySQL will use only ONE index for the WHERE clause a. Have you hinted/forced an index? b. Do you have an index that matches the WHERE clause? c. Do you have one (or more) index(es) that match part of the WHERE clause? i. We can use the leftmost part of multiple-column indexes. ii. The index with the smallest row result is picked. 2. ORDER/GROUP BY uses available indexes on the LEFTMOST column GROUP BY columnA, columnB ORDER BY columnC, columnD Resource: https://dev.mysql.com/doc/refman/8.0/en/mysql-indexes.html
  • 44. Evaluating Indexes ● Same tools as the Identification process ● Is the index redundant? SELECT table_name, redundant_index_name, dominant_index_name FROM sys.schema_redundant_indexes WHERE table_schema = [schema] ● Is the index even used? SELECT object_schema, object_name, index_name FROM performance_schema.table_io_waits_summary_by_index_usage WHERE index_name IS NOT NULL AND index_name != 'PRIMARY' AND count_star = 0 AND object_schema = [schema]
  • 46. Relationships CREATE TABLE users ( id INT UNSIGNED NOT NULL AUTO_INCREMENT, … ); CREATE TABLE posts ( id INT UNSIGNED NOT NULL AUTO_INCREMENT, author_user_id INT UNSIGNED NOT NULL, … FOREIGN KEY (author_user_id) REFERENCES users(id) ON DELETE CASCADE );
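
The effect of the constraint can be observed with a few rows. This is a sketch: the `fk_users` and `fk_posts` tables are hypothetical, cut-down stand-ins for the slide's `users` and `posts`.

```sql
CREATE TABLE fk_users (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE fk_posts (
  id             INT UNSIGNED NOT NULL AUTO_INCREMENT,
  author_user_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (id),
  FOREIGN KEY (author_user_id) REFERENCES fk_users(id) ON DELETE CASCADE
) ENGINE=InnoDB;

INSERT INTO fk_users (id) VALUES (1);
INSERT INTO fk_posts (author_user_id) VALUES (1), (1);

-- Referencing a user that does not exist is rejected by the constraint:
-- INSERT INTO fk_posts (author_user_id) VALUES (999);

-- Deleting the user also removes both of their posts (ON DELETE CASCADE).
DELETE FROM fk_users WHERE id = 1;
SELECT COUNT(*) AS remaining_posts FROM fk_posts;
```
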
  • 47. 🔑FOREIGN KEY or not 💀FOREIGN KEY?
  • 48. Using foreign keys Cons ● Locks ● Performance ● Distributed systems (multiple databases) ● 15 year old bug (foreign keys and triggers) ● Logic in database not code Pros ● Data consistency on INSERT and DELETE ● Enforced type constraints between fields
  • 50. Cargo Culting Text column VARCHAR(255) NOT NULL DEFAULT ''
  • 51. Data Types Size of Field = header + body VARCHAR(64) = length header + up to 64 characters Header = 1 byte if a value can never exceed 255 bytes, otherwise 2 bytes 1 character = 1-4 bytes (depending on encoding) ------------------------------------------------- Minimum = 1 byte (empty string, single-byte encoding) Maximum = 258 bytes (using utf8mb4: 2-byte header + 64 × 4 bytes) Resource: https://dev.mysql.com/doc/refman/8.0/en/storage-requirements.html
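
The gap between characters and bytes is easy to check from a client. The sketch assumes the client really sends UTF-8, which `SET NAMES utf8mb4` declares to the server.

```sql
SET NAMES utf8mb4;

-- CHAR_LENGTH counts characters, LENGTH counts bytes.
-- 'ö' needs two bytes in utf8mb4, so the byte count is higher than the
-- character count for this name.
SELECT
  CHAR_LENGTH('Köhntopp') AS characters,
  LENGTH('Köhntopp')      AS bytes;
```
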
  • 53. 🪑 MySQL Workbench Resource: https://www.mysql.com/products/workbench/
  • 54. 🗃 mysqldump Resource: https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html
      $ vim backup_database.sh
      # Quick and dirty MySQL backup solution
      DATETIME=$(date +%Y%m%d-%H%M%S)
      # Which database to backup automatically
      DATABASE=db_name
      # Use file based password handling
      mysqldump [options] "${DATABASE}" | gzip > "${DATABASE}.${DATETIME}.sql.gz"
      $ crontab -e
      # Every day at 4am
      0 4 * * * backup_database.sh
  • 55. 🕵 Cadfael Resource: https://github.com/xsist10/cadfael
      $ composer global require cadfael/cadfael # or get the phar
      $ ./vendor/bin/cadfael run
          --host [host]           # Where is the database host?
          --username [username]   # MySQL username
          --performance_schema    # Run performance schema checks
          [schema1] [...schema2]  # Which schemas to check?
      +----------------------+-------------------------------+---------+--------------------------------------------------------------------------------------+
      | Check                | Entity                        | Status  | Message                                                                              |
      +----------------------+-------------------------------+---------+--------------------------------------------------------------------------------------+
      | SaneInnoDbPrimaryKey | table_with_insane_primary_key | Warning | In InnoDB tables, the PRIMARY KEY is appended to other indexes.                      |
      |                      |                               |         | If the PRIMARY KEY is big, other indexes will use more space.                        |
      |                      |                               |         | Maybe turn your PRIMARY KEY into UNIQUE and add an auto_increment PRIMARY KEY.       |
      |                      |                               |         | Reference: https://dev.mysql.com/doc/refman/5.7/en/innodb-index-types.html           |
      | EmptyTable           | empty_table                   | Warning | Table contains no records.                                                           |
      | RedundantIndexes     | table_with_insane_primary_key | Concern | Redundant index `full_name` (superseded by `full_name_height_in_cm`).                |
      |                      |                               |         | A redundant index can probably drop it (unless it's a UNIQUE, in which case the      |
      |                      |                               |         | dominant index might be a better candidate for reworking).                           |
      |                      |                               |         | Reference: https://dev.mysql.com/doc/refman/8.0/en/sys-schema-redundant-indexes.html |
      +----------------------+-------------------------------+---------+--------------------------------------------------------------------------------------+
  • 56. 🔄Online Schema Change Perform schema changes on really large tables without locking using OSC. Steps: copy MyTable into MyTable_new; then, atomically, rename MyTable to MyTable_old and MyTable_new to MyTable. Resource: https://github.com/facebookincubator/OnlineSchemaChange/wiki/How-OSC-works
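
The copy-and-rename idea can be sketched by hand in plain SQL. This is a simplified illustration, not how the OSC tool itself is implemented: the column names are hypothetical, the copy is done in one statement rather than in chunks, and writes that arrive during the copy are not captured, which a real tool has to handle.

```sql
-- Hypothetical starting table.
CREATE TABLE MyTable (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(255),
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- 1. Create an empty shadow table and apply the schema change to it.
CREATE TABLE MyTable_new LIKE MyTable;
ALTER TABLE MyTable_new ADD COLUMN nickname VARCHAR(64) NULL;

-- 2. Copy the existing rows across.
INSERT INTO MyTable_new (id, name)
SELECT id, name FROM MyTable;

-- 3. Swap the tables. A multi-table RENAME TABLE is atomic, so clients
--    never see a moment in which MyTable does not exist.
RENAME TABLE MyTable TO MyTable_old, MyTable_new TO MyTable;
```
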
  • 57. ⏭ So what did I cowardly skip?