SlideShare a Scribd company logo
1 of 103
Download to read offline
Tutorial Day: Nov 11th,
9:00a - 12:30p
Dave Stokes
MySQL Community Manager
David.Stokes@Oracle.com
@Stoker
SQL For PHP ProgrammersSQL For PHP Programmers
2
Safe Harbor StatementSafe Harbor Statement
The following is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated
into any contract. It is not a commitment to
deliver any material, code, or functionality, and
should not be relied upon in making purchasing
decision. The development, release, and timing
of any features or functionality described for
Oracle’s products remains at the sole discretion
of Oracle.
3
The Problem with PHP ProgrammersThe Problem with PHP Programmers
Your are up to date on the latest version of PHP
4
The Problem with PHP ProgrammersThe Problem with PHP Programmers
Your are up to date on the latest version of PHP
The latest version of Javascript – no problemo!
5
The Problem with PHP ProgrammersThe Problem with PHP Programmers
Your are up to date on the latest version of PHP
The latest version of Javascript – no problemo!
Frameworks – you know two or three or more – plus the
ones you wrote yourself
6
The Problem with PHP ProgrammersThe Problem with PHP Programmers
Your are up to date on the latest version of PHP
The latest version of Javascript – no problemo!
Frameworks – you know two or three or more – plus the
ones you wrote yourself
But roughly 2-3% have had any training in Structured
Query Language (SQL)
7
So what is SQL?!?!??!?!??!??!
http://en.wikipedia.org/wiki/SQL
SQL (/ˈɛs kjuː ˈɛl/ or /ˈsiːkwəl/; Structured Query
Language) is a special-purpose programming language
designed for managing data held in a relational database
management system (RDBMS), or for stream processing
in a relational data stream management system
(RDSMS).
Originally based upon relational algebra and tuple
relational calculus, SQL consists of a data definition
language and a data manipulation language. The scope of
SQL includes data insert, query, update and delete,
schema creation and modification, and data access
control.
8
Oh Crap!!!
He Said 'relational
algebra' and 'tuple
relational
calculus'!
9
Run Away!!!
10
Relational algebra
http://en.wikipedia.org/wiki/Relational_algebra
Relational algebra is a family of algebra with a well-
founded semantics used for modelling the data stored in
relational databases, and defining queries on it.
To organize the data, first the redundant data and
repeating groups of data are removed, which we call
normalized. By doing this the data is organized or
normalized into what is called first normal form (1NF).
Typically a logical data model documents and
standardizes the relationships between data entities (with
its elements). A primary key uniquely identifies an
instance of an entity, also known as a record.
11
Relation Algebra Continued
Once the data is normalized and in sets of data (entities
and tables), the main operations of the relational algebra
can be performed which are the set operations (such as
union, intersection, and cartesian product), selection
(keeping only some rows of a table) and the projection
(keeping only some columns). Set operations are
performed in the where statement in SQL, which is where
one set of data is related to another set of data.
12
Database Normalization Forms
1nf
– No columns with repeated or similar data
– Each data item cannot be broken down further
– Each row is unique (has a primary key)
– Each filed has a unique name
2nf
– Move non-key attributes that only depend on part of the
key to a new table
● Ignore tables with simple keys or no no-key attributes
3nf
– Move any non-key attributes that are more dependent
on other non-key attributes than the table key to a
new table.
● Ignore tables with zero or only one non-key attribute
13
In more better English, por favor!
3NF means there are no transitive dependencies.
A transitive dependency is when two columnar
relationships imply another relationship. For example,
person -> phone# and phone# -> ringtone, so person ->
ringtone
– A → B
– It is not the case that B → A
– B → C
14
And the rarely seen 4nf & 5nf
You can break the information down further but very rarely
do you need to to 4nf or 5nf
15
So why do all this normalization?
http://databases.about.com/od/specificproducts/a/normalization.htm
Normalization is the process of efficiently
organizing data in a database. There are two
goals of the normalization process: eliminating
redundant data (for example, storing the same
data in more than one table ) and ensuring data
dependencies make sense (only storing related
data in a table). Both of these are worthy goals as
they reduce the amount of space a database
consumes and ensure that data is logically
stored.
16
Example – Cars
Name Gender Color Model
Heather F Blue Mustang
Heather F White Challenger
Eli M Blue F-type
Oscar M Blue 911
Dave M Blue Mustang
There is redundant information
across multiple rows but each
row is unique
17
2nf – split into tables
Name Gender
Heather F
Eli M
Oscar M
Dave M
Color Model Owner
Blue Mustang Heather
White Challenger Heather
Blue F-type Eli
Blue 911 Oscar
Blue Mustang Dave
Split data into
two tables –
one for owner
data and one
for car data
18
3nf – split owner and car info into different tables
Car_ID Color Model Owner
_ID
1 Blue Mustang 1
2 White Challenger 1
3 Blue F-type 2
4 Blue 911 3
5 Blue Mustang 4
The car info is
separated from the
car info. Note that
the car table has a
column for the
owner's ID from the
owner table.
Owner_ID Name Gender
1 Heather F
2 Eli M
3 Oscar M
4 Dave M
19
But what if White Mustang is shared or 4nf
Owner_ID Name Gender
1 Heather F
2 Eli M
3 Oscar M
4 Dave M
Car_id Model Color
1 Mustang Blue
2 Challenger White
3 F-type Blue
4 911 Blue
Car_id Owner_id
1 1
2 1
3 2
4 3
1 4
Tables for Owner,
Car, & Ownership
data
Now we have a flexible way to
search data about owners, cars, and
their relations.
20
So now what!!!
By normalizing to 3nf (or 4th
), we are storing the data with
no redundancies (or very, very few)
Now we need a way to define how the data is stored
And a way to manipulate it.
21
SQL
SQL is a declarative language made up of
– DDL – Data Definition Language
– DML – Data Manipulation Language
SQL was one of the first commercial languages for Edgar
F. Codd's relational model, as described in his influential
1970 paper, "A Relational Model of Data for Large Shared
Data Banks." --Wikipedia
– Codd, Edgar F (June 1970). "A Relational Model of
Data for Large Shared Data Banks". Communications
of the ACM (Association for Computing Machinery)
13 (6): 377–87. doi:10.1145/362384.362685.
Retrieved 2007-06-09.
22
Cod versus Codd
23
SQL is declarative
Describe what you want, not how to process
Hard to look at a query to tell if it is efficient by just looks
Optimizer picks GPS-like best route
– Can pick wrong – traffic, new construction, washed out
roads, and road kill! Oh my!!
24
SQL is made up of two parts
Data Definition Language (DDL)
– For defining data structures
● CREATE, DROP, ALTER, and
RENAME
Data Manipulation Language
– Used to SELECT, INSERT, DELETE, and
UPDATE data
25
DDL
Useful commands
– DESC[ribe] table
– SHOW CREATE TABLE table
–
26
The stuff in the parenthesis
CHAR(30) or VARCHAR(30) will hold strings up to 30
character long.
– SQL MODE (more later) tells server to truncate or
return error if value is longer that 30 characters
–
INT(5) tells the server to show five digits of data
DECIMAL(5,3) stores five digits with two decimals, i.e.
-99.999 to 99.999
FLOAT(7,4) -999.9999 to 999.9999
27
Another look at DESC City
28
NULL No Value
Null is used to indicate a lack of value or no data
– Gender : Male, Female, NULL
Nulls are very messy in B-tree Indexing, try to avoid
Math with NULLs is best avoided
29
DESC City in detail
Describe table tells us the names of the columns (Fields),
the data type, if the column is NULLABLE, Keys, any
default value, and Extras.
30
Data Types
Varies with vendor
Usually have types for text, integers, BLOBs, etc.
Refer to manual
31
MySQL World Database
http://dev.mysql.com/doc/index-other.html
Used in MySQL documentation, books, on line tutorials,
etc.
Three tables
– City
– Country
– Country Language
32
EER Map
33
Simple SELECT
Query
Result
34
Join two tables
To get a query that provides the names of the City and the
names of the countries, JOIN the two tables on a common
data between the two columns (that are hopefully
indexed!)
35
http://i.imgur.com/hhRDO4d.png – Get a copy!!!
36
Simple join
Both City and Country
have columns that
can be used for JOINs
– Country.Code
– City.CountryCode
37
What happened when you send a query
Server receives the query
The user is authenticated for permissions
– Database, table, and/or column level
Syntax
Optimizer
– Statistics on data
– Cost model
● Pick cheapest option (DISK I/O)
● Cardinality of indexes
Get the data
Sorting/Grouping/etc
Data returned
38
EXPLAIN
EXPLAIN is pre pended to the query to show the results
from the optimizer
39
VISUAL Explain
MySQL Workbench
MySQL 5.6/5.7
Uses JSON output from
EXPLAIN and turns it into
something more visually
appealing
40
Optimizer Trace
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "5779.32"
},
"nested_loop": [
{
"table": {
"table_name": "City",
"access_type": "ALL",
"rows_examined_per_scan": 4079,
"rows_produced_per_join": 4079,
"filtered": 100,
"cost_info": {
"read_cost": "68.72",
"eval_cost": "815.80",
"prefix_cost": "884.52",
"data_read_per_join": "286K"
},
"used_columns": [
"Name",
"CountryCode",
"Population"
]
}
},
{
"table": {
"table_name": "Country",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"Code"
],
"key_length": "3",
"ref": [
"world.City.CountryCode"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 4079,
"filtered": 100,
"cost_info": {
"read_cost": "4079.00",
"eval_cost": "815.80",
"prefix_cost": "5779.32",
"data_read_per_join": "1M"
},
"used_columns": [
"Code",
"Name"
41
MySQL Internals Manual :: 7 The Optimizer :: 7.1 Code
and Concepts :: 7.1.2 The Optimizer Code
handle_select()
mysql_select()
JOIN::prepare()
setup_fields()
JOIN::optimize() /* optimizer is from here ... */
optimize_cond()
opt_sum_query()
make_join_statistics()
get_quick_record_count()
choose_plan()
/* Find the best way to access tables */
/* as specified by the user. */
optimize_straight_join()
best_access_path()
/* Find a (sub-)optimal plan among all or subset */
/* of all possible query plans where the user */
/* controls the exhaustiveness of the search. */
greedy_search()
best_extension_by_limited_search()
best_access_path()
/* Perform an exhaustive search for an optimal plan */
find_best()
make_join_select() /* ... to here */
JOIN::exec()
42
Data and Data types
Use the smallest reasonable field
– BIGINT are not needed for customer id numbers
– Signed/unsigned
● Customer “negative seven four two four three”
– All those extra bits have to be moved disk → memory
→ buffer → Ether → buffer → program
CHAR versus VARCHAR
– Space and compression
– Overhead slight
ENUMs and BITs
– Have to plan ahead
– Sorting issues
● Value of the ENUM
43
INTEGER, INT, SMALLINT, TINYINT, MEDIUMINT, BIGINT
Type Storage
(bytes)
Minimum
Signed
Maximum
Signed
Minimum
Unsigned
Maximum
Unsigned
TINYINT 1 -128 127 0 255
SMALLINT 2 -32768 32767 0 65535
MEDIUMINT 3 -8388608 8388607 0 16777215
INT 4 -
2147483648
2147483647 0 4294967295
BIGINT 8 -
9223372036
854775808
9223372036
854775807
0 1844674407
3709551615
44
Creating a table with MySQL command line client
CREATE TABLE fooint (a INT(1), b INT(4), c iINT(10));
INSERT INTO fooint (a,b,c) VALUES (1,100,10000);
INSERT INTO fooint VALUES (777,777,0);
SELECT * from foointg
INSERT INTO fooint (a) values (1234567890);
SELECT * FROM foointG
Note that ; and g are equivalent and G is for vertical
output.
What happens with INSERT INTO fooint (a) values
(12345678900); ??
45
ALTER, TRUNCATE, and DROP table
ALTER TABLE fooint ADD COLUMN id INT UNSIGNED
NOT NULL FIRST;
– Also BEFORE and AFTER
SELECT * FROM fooint;
TRUNCATE fooint;
– Used to remove data but not schema definition
DROP TABLE fooint;
– Goodbye to table and schema
46
Another table
CREATE TABLE fooint (id INT UNSIGNED NOT NULL
PRIMARY KEY AUTO_INCREMENT, name Char(30));
DESC fooint; and SHOW CREATE TABLE foointl
INSERT INTO fooint (name) VALUES ('Alpha'),('Beta'),
('Gamma');
SELECT * FROM fooint;
– Note the id column incremented automatically
47
Foreign Keys
CREATE TABLE employee (
e_id INT NOT NULL,
name CHAR(20),
PRIMARY KEY (e_id)
);
CREATE TABLE building (
office_nbr INT NOT NULL,
description CHAR(20),
e_id INT NOT NULL,
PRIMARY KEY (office_nbr),
FOREIGN KEY (e_id)
REFERENCES employee
(e_id)
ON UPDATE CASCADE
ON DELETE CASCADE);
48
More on foreign keys
mysql> INSERT INTO
employee VALUES (10,'Larry'),
(20,'Shemp'), (40,'Moe');
Query OK, 3 rows affected (0.04
sec)
Records: 3 Duplicates: 0
Warnings: 0
mysql> INSERT INTO building
VALUES (100,'Corner
Office',10), (101,'Lobby',40);
Query OK, 2 rows affected (0.04
sec)
Records: 2 Duplicates: 0
Warnings: 0
mysql> SELECT * FROM employee;
+------+-------+
| e_id | name |
+------+-------+
| 10 | Larry |
| 20 | Shemp |
| 40 | Moe |
+------+-------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM building;
+------------+---------------+------+
| office_nbr | description | e_id |
+------------+---------------+------+
| 100 | Corner Office | 10 |
| 101 | Lobby | 40 |
+------------+---------------+------+
2 rows in set (0.00 sec)
49
Using foreign keys
mysql> SELECT * FROM employee JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 40 | Moe | 101 | Lobby | 40 |
+------+-------+------------+---------------+------+
2 rows in set (0.02 sec)
50
Left Join with foreign keys
mysql> SELECT * FROM employee LEFT JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 40 | Moe | 101 | Lobby | 40 |
| 20 | Shemp | NULL | NULL | NULL |
+------+-------+------------+---------------+------+
3 rows in set (0.00 sec)
51
Foreign keys keep you from messy data
mysql> INSERT INTO building VALUES (120,'Cubicle',77);
ERROR 1452 (23000): Cannot add or update a child row: a
foreign key constraint fails (`test`.`building`, CONSTRAINT
`building_ibfk_1` FOREIGN KEY (`e_id`) REFERENCES
`employee` (`e_id`) ON DELETE CASCADE ON UPDATE
CASCADE)
mysql>
52
Taking advantage of CASCADE
mysql> DELETE FROM employee WHERE e_id=40;
Query OK, 1 row affected (0.08 sec)
mysql> SELECT * FROM employee LEFT JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 20 | Shemp | NULL | NULL | NULL |
+------+-------+------------+---------------+------+
53
Cascade keeping foreign key data updated
mysql> UPDATE employee SET e_id=21 WHERE e_id=20;
Query OK, 1 row affected (0.04 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT * FROM employee LEFT JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 21 | Shemp | NULL | NULL | NULL |
+------+-------+------------+-------------
54
Indexing
MySQL uses mainly B-trees for indexes
Great for =, <, >, <=, =>, BETWEEN, some LIKE 'string%'
55
13.1.8 CREATE INDEX Syntax
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name
[index_type]
ON tbl_name (index_col_name,...)
[index_option]
[algorithm_option | lock_option] ...
index_col_name:
col_name [(length)] [ASC | DESC]
index_type:
USING {BTREE | HASH}
index_option:
KEY_BLOCK_SIZE [=] value
| index_type
| WITH PARSER parser_name
| COMMENT 'string'
algorithm_option:
ALGORITHM [=] {DEFAULT|INPLACE|COPY}
lock_option:
LOCK [=] {DEFAULT|NONE|SHARED|EXCLUSIVE}
56
You may not know
Can create multi-column indexes
– Year,month,day index can be used for
● Year,Month,Day searches
● Year,Month searches
● Year searches
– But not day or month, day searches
Can index on part of a column
– CREATE INDEX ndx1 ON customer (name(10));
NULLs are to be avoided
– Extra optimizer steps
UNIQUE indexes enforce no duplicates
InnoDB will create an index if YOU DO NOT
57
13.2.5.2 INSERT … ON DUPLICATE KEY UPDATE
If you specify ON DUPLICATE KEY UPDATE, and a row
is inserted that would cause a duplicate value in a
UNIQUE index or PRIMARY KEY, an UPDATE of the old
row is performed. For example, if column a is declared as
UNIQUE and contains the value 1, the following two
statements have identical effect:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
58
FIND Asian countries with Population > 5 million
EXPLAIN SELECT * FROM Country
WHERE Continent = 'Asia'
AND Population > 5000000;
59
Will indexes make query faster?
SELECT * FROM Country
WHERE Continent = 'Asia'
AND Population > 5000000;
Index columns
– Population
– Continent
– Population and Continent
– Continent and Population
● Which to pick? Optimizer will look at option that takes
the least CPU and I/O
● Statistics from storage engine
– Indexes to use, order to join tables, avoiding sorts,
is sorting expensive
60
Index on Population
Alter TABLE Country ADD INDEX p (Population)
Down from 239!!
61
A reading from the Book of MySQL – Chapter 8 Optimization
READ this section of the MySQL Manual
– Hardware
– Configuration
– Queries
– Locking
– Benchmarkings
Set InnoDb Buffer Cache Size to 75-80% of system RAM
Move Logs off disks/controllers with data
Use bigger stronger machines for replication slaves
Looking at SSDs, FusionIO cards
Monitor! Monitor! Monitor!
62
ALTER TABLE Country ADD INDEX C (continent)
42 is better than
239 or 54
63
CREATE INDEX cpidx ON Country (continent,
Population)
Event better still!
64
CREATE INDEX pcidx on Country (Population,continent)
FORCING the use of the
new index shows
it is not optimal.
65
Index-es
Indexes need maintenance
– Run OPTIMIZE TABLE when system quiet
– Each INSERT/DELETE takes overhead
● Slows you down
– Therefore remove unused indexes
– MySQL Utilities mysqlindexcheck to look for
unused index on long running systems (do not
use AFTER a restart)
● Use good naming convention, document
– Statistics can be saved/reloaded at shutdown/reboot
● After a reboot w/o saving, the system is going to
need to rebuild stats from scratch, run slower
– Log queries not using Indexes
● Not all of these are bad, just recognize them
66
mysqlindexcheck
67
Transactions
You will need to use a
transactional storage
engine like InnoDB, NDB
You need to START a
transaction, do the work
and COMMIT to record the
changes or ROLLBACK to
cancel the recording.
To avoid using ROLLBACK, you can
employ the following strategy:
– Use LOCK TABLES to lock
all the tables you want to
access.
– Test the conditions that must
be true before performing
the update.
– Update if the conditions are
satisfied.
– Use UNLOCK TABLES to
release your locks.
Note
– This solution does not
handle the situation when
someone kills the threads
in the middle of an
update. In that case, all
locks are released but
some of the updates may
not have been executed.
68
CREATE TABLE account (id int not null, balance
decimal(6,2) default 0)
INSERT INTO account VALUES (1,1000.10), (2,400),
(3,0), (15,.99);
START TRANSACTION;
UPDATE account SET balance = 1010.10 WHERE id = 1;
UPDATE account SET balance = 300 WHERE id=2;
COMMIT;
START TRANSACTION
UPDATE account SET balance=1000 WHERE ID=3;
ROLLBACK;
69
AUTOCOMMIT
AUTOCOMMIT is set to 1 by default or 'do things as I type
them mode'.
– START TRANSACTION overrides
– Some APIs like JDBC have own way to handle
transactions (see chapter 23 in MySQL docs)
AUTOCOMMIT set to 0 requires a COMMIT to store
changes
Some statements cannot be rolled back. In general, these
include data definition language (DDL) statements, such
as those that create or drop databases, those that create,
drop, or alter tables or stored routines
Some statements cause an implicit commit – DDL, user
account changes, transaction control or locking
statements, data loading, and replication control
statements,
70
SET autocommit=0;
LOCK TABLES t1 WRITE, t2 READ, ...;
... do something with tables t1 and t2 here ...
COMMIT;
UNLOCK TABLES;
Note ROLLBACKS do not release locks
71
SET autocommit=0;
LOCK TABLES t1 WRITE, t2 READ, ...;
... do something with tables t1 and t2 here ...
COMMIT;
UNLOCK TABLES;
Note ROLLBACKS do not release locks
72
Scaling MySQL
73
First steps of scaling
✔ Upgrade MySQL versions
✔ 5.5 is 25% faster than 5.1
✔ 5.6 is 20% faster
✔ 5.7 will be faster still
✔ Memory
✔ DISKS
✔ Move logs to different platters, controllers
Move large databases to own disks
✔ TUNE queries
✔ Scale horizontally especially is server is old
✔ Use bigger, more powerful boxes for replicants
✔ Optimize network
74
Read/Write Splitting – Scale READS
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.75
MySQL Fabric
HA, shards,
Write scaling
●
Lets you shard and
reshard
●
Very new
●
Looking for
feedback
●
Released in May
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.76
Why does that query take so LONG!!!
● I only want the top N
customers!
● And I want them sorted
by last name, first
name, middle name,
postal code, and
average purchase!
The stuff on the right
hand side of the
WHERE clause may
seem like a great way
to prune down the
number of records for a
query BUT that is not
always the case.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.77
Query we want to 'prune'
Select City.name AS 'City',
Country.name AS 'Country',
City.Population AS 'Pop'
FROM City
JOIN Country ON
(City.CountryCode = Country.Code)
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.78
The Stats!
Total cost is about
5780. Bad news is
that we have a Full
Table Scan on City
but that was expected.
Good news is that we
have 1 read on
Country using the
Primary key to get
record we need to
match on the join.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.79
Now we only want the top ten!
LIMIT 10
But the cost is still
5780!?!?!
The server has to run
through ALL the
records to get just the
first ten. So LIMIT
does not buy us any
speed in this case.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.80
ORDER BY Country
Oh no! The
cost jumps to
9858! And
we have to
open a temp
table and
then sort that
file
(expensive)
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.81
WHERE City.Population > 1000000
Nope. The
server is doing
the join first
and then
tossing away
the records
that are <=
1000000
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.82
SELECT a.name FROM
(select City.name, CountryCode,
Population from City where Population
> 1000000) as a
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.83
SELECT a.name FROM (select
City.name, CountryCode, Population
from City where Population >
1000000) as a
JOIN Country ON (a.CountryCode =
Country.code)
That is an awful
lot of work to try
to save a few
hundred and
not very pretty.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.84
The World is too small
● The World database is too small and already too
optimized to gain much more performance.
● So lets create a new database and copy the date to
get 100K+ records;
– CREATE TABLE bigcity LIKE city;
– INSERT INTO bigcity SELECT * FROM city;
INSERT INTO bigcity SELECT
NULL,Name,CountryCode,District,Population FROM
bigcity; <many times>
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.85
SELECT countrycode, sum(Population)
FROM bigcity
GROUP BY CountryCode;
● Takes .343 seconds
● ALTER TABLE bigcity ADD INDEX
cc_pop (CountryCode,Population);
● Now takes .156 seconds
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.86
Zipcode – from a free zipcode database
SELECT city, state, zipcode
FROM zip
WHERE zipcode = '76262';
Pretend out app provides the us
postal code given the city and state
data. Here no indexes.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.87
Indexing city and state only.
● CREATE INDEX csidx ON
zip (city(8),state(2)); /* only use first 8 on name and 2
for state*/
● SELECT city, state, zipcode
FROM zip
WHERE city='Roanoke' and state='TX';
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.88
Good optimization
The optimizer uses the
city and state from the
index we created to
know where to dive into
the data. Some indexes
will be able to supply all
the data so there will be
no need to dive into the
data.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.89
Composite and Compound Indexes
● Composite index covers all the columns in the filter,
join and select criteria. All of these columns are
stored on all of the index pages accordingly in the
index B-tree.
● Compound index covers all the filter and join key
columns in the B-tree, keeping the select columns
only on the leaf pages as they will not be searched,
only extracted! This saves space and creates
smaller index pages (and is faster).
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.90
NoSQL
● No Schema
– Teenager with all belongings in center of room
– Does not scale easily
– ACID compliance not important
– Effective for some jobs but watch out if you need an
AND or an OR
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.91
http://en.wikipedia.org/wiki/NoSQL
A NoSQL (often interpreted as Not Only SQL) database
provides a mechanism for storage and retrieval of data
that is modeled in means other than the tabular relations
used in relational databases. Motivations for this
approach include simplicity of design, horizontal scaling
and finer control over availability. The data structure (e.g.
key-value, graph, or document) differs from the RDBMS,
and therefore some operations are faster in NoSQL and
some in RDBMS. There are differences though, and the
particular suitability of a given NoSQL DB depends on
the problem it must solve (e.g., does the solution use
graph algorithms?).
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.92
Key/value pair
● A key-value pair (KVP) is a set of two linked data
items: a key, which is a unique identifier for some
item of data, and the value, which is either the data
that is identified or a pointer to the location of that
data.
http://searchenterprisedesktop.techtarget.com/definition/key-value-pair
● Berkeley Database (Sleepycat) 1986
– Hash table
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.93
Graph Databases
● In computing, a graph database is a database that
uses graph structures with nodes, edges, and
properties to represent and store data. A graph
database is any storage system that provides index-
free adjacency. This means that every element
contains a direct pointer to its adjacent elements
and no index lookups are necessary
http://en.wikipedia.org/wiki/Graph_database
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.94
Document Databases
A document-oriented database is a computer program
designed for storing, retrieving, and managing
document-oriented information, also known as semi-
structured data. Document-oriented databases are one
of the main categories of NoSQL databases and the
popularity of the term "document-oriented database"
(or "document store") has grown with the use of the
term NoSQL itself. In contrast to relational databases
and their notions of"Tables" and "Relations" (between
Tables), these systems are designed around an
abstract notion of a "Document".
http://en.wikipedia.org/wiki/Document-oriented_database
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.95
Hybrids
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.96
Bypassing syntax checking and
optimizer upto 9x faster than SQL
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.97
Paradigm Shifting w/o a clutch
● SQL is a declarative language
● PHP is procedural/object orientated
● Square Pegs for Round Holes
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.98
Object Relation Mapper
● Adds another layer
– Easier to learn SQL
– Not as expressive as SQL
– ORMs are a hammer in a world of nails and blasting
caps
– If you can not solve problem with an object, adding
another object is not going to help
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.99
Query Rewrite plugin or How To Fix
your ORM's lousy SQL
mysql < <plugin directory>/install_rewriter_plugin.sql
INSERT INTO query_rewrite.rewrite_rules( pattern,
replacement )
VALUES ( 'SELECT ?', 'SELECT ? + 1' );
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.100
Better Example
● SELECT patientId, time
FROM patient_records
WHERE patientId > 1000 AND patientId < 2000
AND time < “2014-09-08” AND illness = ”Headache”
ORDER BY time LIMIT 1;
● SELECT patiendId, time
FROM patient_records
WHERE patientId > 1000 AND patientID < 2000
AND time < “2014-09-08” AND illness =
”Acrocephalosyndactylia” *
ORDER BY time LIMIT 1;
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.101
The Rules for the rewrite
INSERT INTO query_rewrite.rewrite_rules ( pattern, pattern_database, replacement )
VALUES
(
# Pattern
'SELECT patientid, time ' ||
'FROM patient_records ' ||
'WHERE patientid > ? AND patientid < ? ' ||
'AND time < ? AND illness = ? ' ||
'ORDER BY time LIMIT 1',
# Database
'health_clinic',
# Replacement
'SELECT patientid, time ' ||
'FROM patient_records FORCE INDEX(patientIdIdx) ' ||
'WHERE patientid > ? AND patientid < ? ' ||
'AND time < ? AND illness = ? ' ||
'ORDER BY time LIMIT 1');
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.102
Recommended Books
●
High Performance MySQL, 3rd
Edition
Baron Schwartz, Peter Zaitsev & Vadim
Tkachenko
● SQL Antipatterns
– Bill Karwin
● SQL and Relational Theory
– CJ Date
● Effective MySQL Replication Techniques in Depth
– Ronald Bradford
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.103
Q/A
● Slideshare.net/davestokes
● @stoker
● David.Stokes@Oracle.com

More Related Content

What's hot

Relational database
Relational  databaseRelational  database
Relational databaseamkrisha
 
Chapter 2 Relational Data Model-part 2
Chapter 2 Relational Data Model-part 2Chapter 2 Relational Data Model-part 2
Chapter 2 Relational Data Model-part 2Eddyzulham Mahluzydde
 
Database normalization
Database normalizationDatabase normalization
Database normalizationJignesh Jain
 
Types of Tree in Data Structure in C++
Types of Tree in Data Structure in C++Types of Tree in Data Structure in C++
Types of Tree in Data Structure in C++Himanshu Choudhary
 
Advanced Normalization
Advanced NormalizationAdvanced Normalization
Advanced NormalizationAbdullah Khosa
 
Cs511 data-extraction
Cs511 data-extractionCs511 data-extraction
Cs511 data-extractionBorseshweta
 
Normalization in Database
Normalization in DatabaseNormalization in Database
Normalization in DatabaseRoshni Singh
 
Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3Eddyzulham Mahluzydde
 
Normalization 1 nf,2nf,3nf,bcnf
Normalization 1 nf,2nf,3nf,bcnf Normalization 1 nf,2nf,3nf,bcnf
Normalization 1 nf,2nf,3nf,bcnf Shriya agrawal
 
Understanding about relational database m-square systems inc
Understanding about relational database m-square systems incUnderstanding about relational database m-square systems inc
Understanding about relational database m-square systems incMuthu Natarajan
 
Duplicate Detection in Hierarchical Data Using XPath
Duplicate Detection in Hierarchical Data Using XPathDuplicate Detection in Hierarchical Data Using XPath
Duplicate Detection in Hierarchical Data Using XPathiosrjce
 

What's hot (19)

Databases: Normalisation
Databases: NormalisationDatabases: Normalisation
Databases: Normalisation
 
Relational database
Relational  databaseRelational  database
Relational database
 
Chapter 2 Relational Data Model-part 2
Chapter 2 Relational Data Model-part 2Chapter 2 Relational Data Model-part 2
Chapter 2 Relational Data Model-part 2
 
Normalization in DBMS
Normalization in DBMSNormalization in DBMS
Normalization in DBMS
 
Database normalization
Database normalizationDatabase normalization
Database normalization
 
Types of Tree in Data Structure in C++
Types of Tree in Data Structure in C++Types of Tree in Data Structure in C++
Types of Tree in Data Structure in C++
 
Dbms 4NF & 5NF
Dbms 4NF & 5NFDbms 4NF & 5NF
Dbms 4NF & 5NF
 
Advanced Normalization
Advanced NormalizationAdvanced Normalization
Advanced Normalization
 
Cs511 data-extraction
Cs511 data-extractionCs511 data-extraction
Cs511 data-extraction
 
Ch3rerevised
Ch3rerevisedCh3rerevised
Ch3rerevised
 
Normalization in Database
Normalization in DatabaseNormalization in Database
Normalization in Database
 
Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3Chapter 2 Relational Data Model-part 3
Chapter 2 Relational Data Model-part 3
 
Normalization 1 nf,2nf,3nf,bcnf
Normalization 1 nf,2nf,3nf,bcnf Normalization 1 nf,2nf,3nf,bcnf
Normalization 1 nf,2nf,3nf,bcnf
 
Database Presentation
Database PresentationDatabase Presentation
Database Presentation
 
Tree in data structure
Tree in data structureTree in data structure
Tree in data structure
 
Understanding about relational database m-square systems inc
Understanding about relational database m-square systems incUnderstanding about relational database m-square systems inc
Understanding about relational database m-square systems inc
 
Normalization in DBMS
Normalization in DBMSNormalization in DBMS
Normalization in DBMS
 
Dbms
DbmsDbms
Dbms
 
Duplicate Detection in Hierarchical Data Using XPath
Duplicate Detection in Hierarchical Data Using XPathDuplicate Detection in Hierarchical Data Using XPath
Duplicate Detection in Hierarchical Data Using XPath
 

Similar to SQL for PHP Programmers -- Dallas PHP Users Group Jan 2015

Lecture 13
Lecture 13Lecture 13
Lecture 13Shani729
 
In Memory Database Essay
In Memory Database EssayIn Memory Database Essay
In Memory Database EssayTammy Moncrief
 
SQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics CoveredSQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics CoveredDanish Mehraj
 
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilNLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilDatabricks
 
Database management system overview
Database management system overviewDatabase management system overview
Database management system overviewNj Saini
 
Data Warehouse ( Dw Of Dwh )
Data Warehouse ( Dw Of Dwh )Data Warehouse ( Dw Of Dwh )
Data Warehouse ( Dw Of Dwh )Jenny Calhoon
 
Pandas vs. SQL – Tools that Data Scientists use most often.pdf
Pandas vs. SQL – Tools that Data Scientists use most often.pdfPandas vs. SQL – Tools that Data Scientists use most often.pdf
Pandas vs. SQL – Tools that Data Scientists use most often.pdfData Science Council of America
 
Advantages And Uses Of SQL
Advantages And Uses Of SQLAdvantages And Uses Of SQL
Advantages And Uses Of SQLSandra Arveseth
 
dbms final.pdf
dbms final.pdfdbms final.pdf
dbms final.pdfRaKesh3868
 
An overview of fragmentation
An overview of fragmentationAn overview of fragmentation
An overview of fragmentationcsandit
 
Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)Maryam Fida
 
PostgreSQL Tutorial for Beginners | Edureka
PostgreSQL Tutorial for Beginners | EdurekaPostgreSQL Tutorial for Beginners | Edureka
PostgreSQL Tutorial for Beginners | EdurekaEdureka!
 

Similar to SQL for PHP Programmers -- Dallas PHP Users Group Jan 2015 (20)

Lecture 13
Lecture 13Lecture 13
Lecture 13
 
In Memory Database Essay
In Memory Database EssayIn Memory Database Essay
In Memory Database Essay
 
T-SQL Overview
T-SQL OverviewT-SQL Overview
T-SQL Overview
 
51095137-Semantic-WEB.ppt
51095137-Semantic-WEB.ppt51095137-Semantic-WEB.ppt
51095137-Semantic-WEB.ppt
 
PO WER - Piotr Mariat - Sql
PO WER - Piotr Mariat - SqlPO WER - Piotr Mariat - Sql
PO WER - Piotr Mariat - Sql
 
SQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics CoveredSQL Complete Tutorial. All Topics Covered
SQL Complete Tutorial. All Topics Covered
 
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobilNLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
NLP-Focused Applied ML at Scale for Global Fleet Analytics at ExxonMobil
 
D B M S Animate
D B M S AnimateD B M S Animate
D B M S Animate
 
RDMS AND SQL
RDMS AND SQLRDMS AND SQL
RDMS AND SQL
 
Database management system overview
Database management system overviewDatabase management system overview
Database management system overview
 
Data Warehouse ( Dw Of Dwh )
Data Warehouse ( Dw Of Dwh )Data Warehouse ( Dw Of Dwh )
Data Warehouse ( Dw Of Dwh )
 
Pandas vs. SQL – Tools that Data Scientists use most often.pdf
Pandas vs. SQL – Tools that Data Scientists use most often.pdfPandas vs. SQL – Tools that Data Scientists use most often.pdf
Pandas vs. SQL – Tools that Data Scientists use most often.pdf
 
Advantages And Uses Of SQL
Advantages And Uses Of SQLAdvantages And Uses Of SQL
Advantages And Uses Of SQL
 
dbms final.pdf
dbms final.pdfdbms final.pdf
dbms final.pdf
 
4.Database Management System.pdf
4.Database Management System.pdf4.Database Management System.pdf
4.Database Management System.pdf
 
AQA Computer science easter revision
AQA Computer science easter revisionAQA Computer science easter revision
AQA Computer science easter revision
 
An overview of fragmentation
An overview of fragmentationAn overview of fragmentation
An overview of fragmentation
 
Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)Intro databases (Table, Record, Field)
Intro databases (Table, Record, Field)
 
nosql.pptx
nosql.pptxnosql.pptx
nosql.pptx
 
PostgreSQL Tutorial for Beginners | Edureka
PostgreSQL Tutorial for Beginners | EdurekaPostgreSQL Tutorial for Beginners | Edureka
PostgreSQL Tutorial for Beginners | Edureka
 

More from Dave Stokes

Locking Down Your MySQL Database.pptx
Locking Down Your MySQL Database.pptxLocking Down Your MySQL Database.pptx
Locking Down Your MySQL Database.pptxDave Stokes
 
Linuxfest Northwest 2022 - MySQL 8.0 Nre Features
Linuxfest Northwest 2022 - MySQL 8.0 Nre FeaturesLinuxfest Northwest 2022 - MySQL 8.0 Nre Features
Linuxfest Northwest 2022 - MySQL 8.0 Nre FeaturesDave Stokes
 
MySQL Indexes and Histograms - RMOUG Training Days 2022
MySQL Indexes and Histograms - RMOUG Training Days 2022MySQL Indexes and Histograms - RMOUG Training Days 2022
MySQL Indexes and Histograms - RMOUG Training Days 2022Dave Stokes
 
MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019
MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019
MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019Dave Stokes
 
Windowing Functions - Little Rock Tech fest 2019
Windowing Functions - Little Rock Tech fest 2019Windowing Functions - Little Rock Tech fest 2019
Windowing Functions - Little Rock Tech fest 2019Dave Stokes
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019Dave Stokes
 
Develop PHP Applications with MySQL X DevAPI
Develop PHP Applications with MySQL X DevAPIDevelop PHP Applications with MySQL X DevAPI
Develop PHP Applications with MySQL X DevAPIDave Stokes
 
MySQL 8 Tips and Tricks from Symfony USA 2018, San Francisco
MySQL 8 Tips and Tricks from Symfony USA 2018, San FranciscoMySQL 8 Tips and Tricks from Symfony USA 2018, San Francisco
MySQL 8 Tips and Tricks from Symfony USA 2018, San FranciscoDave Stokes
 
The Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL DatabasesThe Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL DatabasesDave Stokes
 
MySQL without the SQL -- Cascadia PHP
MySQL without the SQL -- Cascadia PHPMySQL without the SQL -- Cascadia PHP
MySQL without the SQL -- Cascadia PHPDave Stokes
 
MySQL 8 Server Optimization Swanseacon 2018
MySQL 8 Server Optimization Swanseacon 2018MySQL 8 Server Optimization Swanseacon 2018
MySQL 8 Server Optimization Swanseacon 2018Dave Stokes
 
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018MySQL Without The SQL -- Oh My! PHP[Tek] June 2018
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018Dave Stokes
 
Presentation Skills for Open Source Folks
Presentation Skills for Open Source FolksPresentation Skills for Open Source Folks
Presentation Skills for Open Source FolksDave Stokes
 
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
MySQL Without the SQL -- Oh My!  Longhorn PHP ConferenceMySQL Without the SQL -- Oh My!  Longhorn PHP Conference
MySQL Without the SQL -- Oh My! Longhorn PHP ConferenceDave Stokes
 
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)Dave Stokes
 
ConFoo MySQL Replication Evolution : From Simple to Group Replication
ConFoo  MySQL Replication Evolution : From Simple to Group ReplicationConFoo  MySQL Replication Evolution : From Simple to Group Replication
ConFoo MySQL Replication Evolution : From Simple to Group ReplicationDave Stokes
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsDave Stokes
 
Making MySQL Agile-ish
Making MySQL Agile-ishMaking MySQL Agile-ish
Making MySQL Agile-ishDave Stokes
 
PHP Database Programming Basics -- Northeast PHP
PHP Database Programming Basics -- Northeast PHPPHP Database Programming Basics -- Northeast PHP
PHP Database Programming Basics -- Northeast PHPDave Stokes
 
MySQL 101 PHPTek 2017
MySQL 101 PHPTek 2017MySQL 101 PHPTek 2017
MySQL 101 PHPTek 2017Dave Stokes
 

More from Dave Stokes (20)

Locking Down Your MySQL Database.pptx
Locking Down Your MySQL Database.pptxLocking Down Your MySQL Database.pptx
Locking Down Your MySQL Database.pptx
 
Linuxfest Northwest 2022 - MySQL 8.0 Nre Features
Linuxfest Northwest 2022 - MySQL 8.0 Nre FeaturesLinuxfest Northwest 2022 - MySQL 8.0 Nre Features
Linuxfest Northwest 2022 - MySQL 8.0 Nre Features
 
MySQL Indexes and Histograms - RMOUG Training Days 2022
MySQL Indexes and Histograms - RMOUG Training Days 2022MySQL Indexes and Histograms - RMOUG Training Days 2022
MySQL Indexes and Histograms - RMOUG Training Days 2022
 
MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019
MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019
MySQL 8.0 Features -- Oracle CodeOne 2019, All Things Open 2019
 
Windowing Functions - Little Rock Tech fest 2019
Windowing Functions - Little Rock Tech fest 2019Windowing Functions - Little Rock Tech fest 2019
Windowing Functions - Little Rock Tech fest 2019
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
 
Develop PHP Applications with MySQL X DevAPI
Develop PHP Applications with MySQL X DevAPIDevelop PHP Applications with MySQL X DevAPI
Develop PHP Applications with MySQL X DevAPI
 
MySQL 8 Tips and Tricks from Symfony USA 2018, San Francisco
MySQL 8 Tips and Tricks from Symfony USA 2018, San FranciscoMySQL 8 Tips and Tricks from Symfony USA 2018, San Francisco
MySQL 8 Tips and Tricks from Symfony USA 2018, San Francisco
 
The Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL DatabasesThe Proper Care and Feeding of MySQL Databases
The Proper Care and Feeding of MySQL Databases
 
MySQL without the SQL -- Cascadia PHP
MySQL without the SQL -- Cascadia PHPMySQL without the SQL -- Cascadia PHP
MySQL without the SQL -- Cascadia PHP
 
MySQL 8 Server Optimization Swanseacon 2018
MySQL 8 Server Optimization Swanseacon 2018MySQL 8 Server Optimization Swanseacon 2018
MySQL 8 Server Optimization Swanseacon 2018
 
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018MySQL Without The SQL -- Oh My! PHP[Tek] June 2018
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018
 
Presentation Skills for Open Source Folks
Presentation Skills for Open Source FolksPresentation Skills for Open Source Folks
Presentation Skills for Open Source Folks
 
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
MySQL Without the SQL -- Oh My!  Longhorn PHP ConferenceMySQL Without the SQL -- Oh My!  Longhorn PHP Conference
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
 
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
 
ConFoo MySQL Replication Evolution : From Simple to Group Replication
ConFoo  MySQL Replication Evolution : From Simple to Group ReplicationConFoo  MySQL Replication Evolution : From Simple to Group Replication
ConFoo MySQL Replication Evolution : From Simple to Group Replication
 
Advanced MySQL Query Optimizations
Advanced MySQL Query OptimizationsAdvanced MySQL Query Optimizations
Advanced MySQL Query Optimizations
 
Making MySQL Agile-ish
Making MySQL Agile-ishMaking MySQL Agile-ish
Making MySQL Agile-ish
 
PHP Database Programming Basics -- Northeast PHP
PHP Database Programming Basics -- Northeast PHPPHP Database Programming Basics -- Northeast PHP
PHP Database Programming Basics -- Northeast PHP
 
MySQL 101 PHPTek 2017
MySQL 101 PHPTek 2017MySQL 101 PHPTek 2017
MySQL 101 PHPTek 2017
 

Recently uploaded

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Recently uploaded (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

SQL for PHP Programmers -- Dallas PHP Users Group Jan 2015

  • 1. Tutorial Day: Nov 11th, 9:00a - 12:30p Dave Stokes MySQL Community Manager David.Stokes@Oracle.com @Stoker SQL For PHP ProgrammersSQL For PHP Programmers
  • 2. 2 Safe Harbor StatementSafe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decision. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 3. 3 The Problem with PHP ProgrammersThe Problem with PHP Programmers Your are up to date on the latest version of PHP
  • 4. 4 The Problem with PHP ProgrammersThe Problem with PHP Programmers Your are up to date on the latest version of PHP The latest version of Javascript – no problemo!
  • 5. 5 The Problem with PHP ProgrammersThe Problem with PHP Programmers Your are up to date on the latest version of PHP The latest version of Javascript – no problemo! Frameworks – you know two or three or more – plus the ones you wrote yourself
  • 6. 6 The Problem with PHP ProgrammersThe Problem with PHP Programmers Your are up to date on the latest version of PHP The latest version of Javascript – no problemo! Frameworks – you know two or three or more – plus the ones you wrote yourself But roughly 2-3% have had any training in Structured Query Language (SQL)
  • 7. 7 So what is SQL?!?!??!?!??!??! http://en.wikipedia.org/wiki/SQL SQL (/ˈɛs kjuː ˈɛl/ or /ˈsiːkwəl/; Structured Query Language) is a special-purpose programming language designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). Originally based upon relational algebra and tuple relational calculus, SQL consists of a data definition language and a data manipulation language. The scope of SQL includes data insert, query, update and delete, schema creation and modification, and data access control.
  • 8. 8 Oh Crap!!! He Said 'relational algebra' and 'tuple relational calculus'!
  • 10. 10 Relational algebra http://en.wikipedia.org/wiki/Relational_algebra Relational algebra is a family of algebra with a well- founded semantics used for modelling the data stored in relational databases, and defining queries on it. To organize the data, first the redundant data and repeating groups of data are removed, which we call normalized. By doing this the data is organized or normalized into what is called first normal form (1NF). Typically a logical data model documents and standardizes the relationships between data entities (with its elements). A primary key uniquely identifies an instance of an entity, also known as a record.
  • 11. 11 Relation Algebra Continued Once the data is normalized and in sets of data (entities and tables), the main operations of the relational algebra can be performed which are the set operations (such as union, intersection, and cartesian product), selection (keeping only some rows of a table) and the projection (keeping only some columns). Set operations are performed in the where statement in SQL, which is where one set of data is related to another set of data.
  • 12. 12 Database Normalization Forms 1nf – No columns with repeated or similar data – Each data item cannot be broken down further – Each row is unique (has a primary key) – Each filed has a unique name 2nf – Move non-key attributes that only depend on part of the key to a new table ● Ignore tables with simple keys or no no-key attributes 3nf – Move any non-key attributes that are more dependent on other non-key attributes than the table key to a new table. ● Ignore tables with zero or only one non-key attribute
  • 13. 13 In more better English, por favor! 3NF means there are no transitive dependencies. A transitive dependency is when two columnar relationships imply another relationship. For example, person -> phone# and phone# -> ringtone, so person -> ringtone – A → B – It is not the case that B → A – B → C
  • 14. 14 And the rarely seen 4nf & 5nf You can break the information down further but very rarely do you need to to 4nf or 5nf
  • 15. 15 So why do all this normalization? http://databases.about.com/od/specificproducts/a/normalization.htm Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table ) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.
  • 16. 16 Example – Cars Name Gender Color Model Heather F Blue Mustang Heather F White Challenger Eli M Blue F-type Oscar M Blue 911 Dave M Blue Mustang There is redundant information across multiple rows but each row is unique
  • 17. 17 2nf – split into tables Name Gender Heather F Eli M Oscar M Dave M Color Model Owner Blue Mustang Heather White Challenger Heather Blue F-type Eli Blue 911 Oscar Blue Mustang Dave Split data into two tables – one for owner data and one for car data
  • 18. 18 3nf – split owner and car info into different tables Car_ID Color Model Owner _ID 1 Blue Mustang 1 2 White Challenger 1 3 Blue F-type 2 4 Blue 911 3 5 Blue Mustang 4 The car info is separated from the car info. Note that the car table has a column for the owner's ID from the owner table. Owner_ID Name Gender 1 Heather F 2 Eli M 3 Oscar M 4 Dave M
  • 19. 19 But what if White Mustang is shared or 4nf Owner_ID Name Gender 1 Heather F 2 Eli M 3 Oscar M 4 Dave M Car_id Model Color 1 Mustang Blue 2 Challenger White 3 F-type Blue 4 911 Blue Car_id Owner_id 1 1 2 1 3 2 4 3 1 4 Tables for Owner, Car, & Ownership data Now we have a flexible way to search data about owners, cars, and their relations.
  • 20. 20 So now what!!! By normalizing to 3nf (or 4th ), we are storing the data with no redundancies (or very, very few) Now we need a way to define how the data is stored And a way to manipulate it.
  • 21. 21 SQL SQL is a declarative language made up of – DDL – Data Definition Language – DML – Data Manipulation Language SQL was one of the first commercial languages for Edgar F. Codd's relational model, as described in his influential 1970 paper, "A Relational Model of Data for Large Shared Data Banks." --Wikipedia – Codd, Edgar F (June 1970). "A Relational Model of Data for Large Shared Data Banks". Communications of the ACM (Association for Computing Machinery) 13 (6): 377–87. doi:10.1145/362384.362685. Retrieved 2007-06-09.
  • 23. 23 SQL is declarative Describe what you want, not how to process Hard to look at a query to tell if it is efficient by just looks Optimizer picks GPS-like best route – Can pick wrong – traffic, new construction, washed out roads, and road kill! Oh my!!
  • 24. 24 SQL is made up of two parts Data Definition Language (DDL) – For defining data structures ● CREATE, DROP, ALTER, and RENAME Data Manipulation Language – Used to SELECT, INSERT, DELETE, and UPDATE data
  • 25. 25 DDL Useful commands – DESC[ribe] table – SHOW CREATE TABLE table –
  • 26. 26 The stuff in the parenthesis CHAR(30) or VARCHAR(30) will hold strings up to 30 character long. – SQL MODE (more later) tells server to truncate or return error if value is longer that 30 characters – INT(5) tells the server to show five digits of data DECIMAL(5,3) stores five digits with two decimals, i.e. -99.999 to 99.999 FLOAT(7,4) -999.9999 to 999.9999
  • 27. 27 Another look at DESC City
  • 28. 28 NULL No Value Null is used to indicate a lack of value or no data – Gender : Male, Female, NULL Nulls are very messy in B-tree Indexing, try to avoid Math with NULLs is best avoided
  • 29. 29 DESC City in detail Describe table tells us the names of the columns (Fields), the data type, if the column is NULLABLE, Keys, any default value, and Extras.
  • 30. 30 Data Types Varies with vendor Usually have types for text, integers, BLOBs, etc. Refer to manual
  • 31. 31 MySQL World Database http://dev.mysql.com/doc/index-other.html Used in MySQL documentation, books, on line tutorials, etc. Three tables – City – Country – Country Language
  • 34. 34 Join two tables To get a query that provides the names of the City and the names of the countries, JOIN the two tables on a common data between the two columns (that are hopefully indexed!)
  • 36. 36 Simple join Both City and Country have columns that can be used for JOINs – Country.Code – City.CountryCode
  • 37. 37 What happened when you send a query Server receives the query The user is authenticated for permissions – Database, table, and/or column level Syntax Optimizer – Statistics on data – Cost model ● Pick cheapest option (DISK I/O) ● Cardinality of indexes Get the data Sorting/Grouping/etc Data returned
  • 38. 38 EXPLAIN EXPLAIN is pre pended to the query to show the results from the optimizer
  • 39. 39 VISUAL Explain MySQL Workbench MySQL 5.6/5.7 Uses JSON output from EXPLAIN and turns it into something more visually appealing
  • 40. 40 Optimizer Trace { "query_block": { "select_id": 1, "cost_info": { "query_cost": "5779.32" }, "nested_loop": [ { "table": { "table_name": "City", "access_type": "ALL", "rows_examined_per_scan": 4079, "rows_produced_per_join": 4079, "filtered": 100, "cost_info": { "read_cost": "68.72", "eval_cost": "815.80", "prefix_cost": "884.52", "data_read_per_join": "286K" }, "used_columns": [ "Name", "CountryCode", "Population" ] } }, { "table": { "table_name": "Country", "access_type": "eq_ref", "possible_keys": [ "PRIMARY" ], "key": "PRIMARY", "used_key_parts": [ "Code" ], "key_length": "3", "ref": [ "world.City.CountryCode" ], "rows_examined_per_scan": 1, "rows_produced_per_join": 4079, "filtered": 100, "cost_info": { "read_cost": "4079.00", "eval_cost": "815.80", "prefix_cost": "5779.32", "data_read_per_join": "1M" }, "used_columns": [ "Code", "Name"
  • 41. 41 MySQL Internals Manual :: 7 The Optimizer :: 7.1 Code and Concepts :: 7.1.2 The Optimizer Code handle_select() mysql_select() JOIN::prepare() setup_fields() JOIN::optimize() /* optimizer is from here ... */ optimize_cond() opt_sum_query() make_join_statistics() get_quick_record_count() choose_plan() /* Find the best way to access tables */ /* as specified by the user. */ optimize_straight_join() best_access_path() /* Find a (sub-)optimal plan among all or subset */ /* of all possible query plans where the user */ /* controls the exhaustiveness of the search. */ greedy_search() best_extension_by_limited_search() best_access_path() /* Perform an exhaustive search for an optimal plan */ find_best() make_join_select() /* ... to here */ JOIN::exec()
  • 42. 42 Data and Data types Use the smallest reasonable field – BIGINT are not needed for customer id numbers – Signed/unsigned ● Customer “negative seven four two four three” – All those extra bits have to be moved disk → memory → buffer → Ether → buffer → program CHAR versus VARCHAR – Space and compression – Overhead slight ENUMs and BITs – Have to plan ahead – Sorting issues ● Value of the ENUM
  • 43. 43 INTEGER, INT, SMALLINT, TINYINT, MEDIUMINT, BIGINT Type Storage (bytes) Minimum Signed Maximum Signed Minimum Unsigned Maximum Unsigned TINYINT 1 -128 127 0 255 SMALLINT 2 -32768 32767 0 65535 MEDIUMINT 3 -8388608 8388607 0 16777215 INT 4 - 2147483648 2147483647 0 4294967295 BIGINT 8 - 9223372036 854775808 9223372036 854775807 0 1844674407 3709551615
  • 44. 44 Creating a table with MySQL command line client CREATE TABLE fooint (a INT(1), b INT(4), c iINT(10)); INSERT INTO fooint (a,b,c) VALUES (1,100,10000); INSERT INTO fooint VALUES (777,777,0); SELECT * from foointg INSERT INTO fooint (a) values (1234567890); SELECT * FROM foointG Note that ; and g are equivalent and G is for vertical output. What happens with INSERT INTO fooint (a) values (12345678900); ??
  • 45. 45 ALTER, TRUNCATE, and DROP table ALTER TABLE fooint ADD COLUMN id INT UNSIGNED NOT NULL FIRST; – Also BEFORE and AFTER SELECT * FROM fooint; TRUNCATE fooint; – Used to remove data but not schema definition DROP TABLE fooint; – Goodbye to table and schema
  • 46. 46 Another table CREATE TABLE fooint (id INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, name Char(30)); DESC fooint; and SHOW CREATE TABLE foointl INSERT INTO fooint (name) VALUES ('Alpha'),('Beta'), ('Gamma'); SELECT * FROM fooint; – Note the id column incremented automatically
  • 47. 47 Foreign Keys CREATE TABLE employee ( e_id INT NOT NULL, name CHAR(20), PRIMARY KEY (e_id) ); CREATE TABLE building ( office_nbr INT NOT NULL, description CHAR(20), e_id INT NOT NULL, PRIMARY KEY (office_nbr), FOREIGN KEY (e_id) REFERENCES employee (e_id) ON UPDATE CASCADE ON DELETE CASCADE);
  • 48. 48 More on foreign keys mysql> INSERT INTO employee VALUES (10,'Larry'), (20,'Shemp'), (40,'Moe'); Query OK, 3 rows affected (0.04 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> INSERT INTO building VALUES (100,'Corner Office',10), (101,'Lobby',40); Query OK, 2 rows affected (0.04 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM employee; +------+-------+ | e_id | name | +------+-------+ | 10 | Larry | | 20 | Shemp | | 40 | Moe | +------+-------+ 3 rows in set (0.00 sec) mysql> SELECT * FROM building; +------------+---------------+------+ | office_nbr | description | e_id | +------------+---------------+------+ | 100 | Corner Office | 10 | | 101 | Lobby | 40 | +------------+---------------+------+ 2 rows in set (0.00 sec)
  • 49. 49 Using foreign keys mysql> SELECT * FROM employee JOIN building ON (employee.e_id=building.e_id); +------+-------+------------+---------------+------+ | e_id | name | office_nbr | description | e_id | +------+-------+------------+---------------+------+ | 10 | Larry | 100 | Corner Office | 10 | | 40 | Moe | 101 | Lobby | 40 | +------+-------+------------+---------------+------+ 2 rows in set (0.02 sec)
  • 50. 50 Left Join with foreign keys mysql> SELECT * FROM employee LEFT JOIN building ON (employee.e_id=building.e_id); +------+-------+------------+---------------+------+ | e_id | name | office_nbr | description | e_id | +------+-------+------------+---------------+------+ | 10 | Larry | 100 | Corner Office | 10 | | 40 | Moe | 101 | Lobby | 40 | | 20 | Shemp | NULL | NULL | NULL | +------+-------+------------+---------------+------+ 3 rows in set (0.00 sec)
  • 51. 51 Foreign keys keep you from messy data mysql> INSERT INTO building VALUES (120,'Cubicle',77); ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`test`.`building`, CONSTRAINT `building_ibfk_1` FOREIGN KEY (`e_id`) REFERENCES `employee` (`e_id`) ON DELETE CASCADE ON UPDATE CASCADE) mysql>
  • 52. 52 Taking advantage of CASCADE mysql> DELETE FROM employee WHERE e_id=40; Query OK, 1 row affected (0.08 sec) mysql> SELECT * FROM employee LEFT JOIN building ON (employee.e_id=building.e_id); +------+-------+------------+---------------+------+ | e_id | name | office_nbr | description | e_id | +------+-------+------------+---------------+------+ | 10 | Larry | 100 | Corner Office | 10 | | 20 | Shemp | NULL | NULL | NULL | +------+-------+------------+---------------+------+
  • 53. 53 Cascade keeping foreign key data updated mysql> UPDATE employee SET e_id=21 WHERE e_id=20; Query OK, 1 row affected (0.04 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> SELECT * FROM employee LEFT JOIN building ON (employee.e_id=building.e_id); +------+-------+------------+---------------+------+ | e_id | name | office_nbr | description | e_id | +------+-------+------------+---------------+------+ | 10 | Larry | 100 | Corner Office | 10 | | 21 | Shemp | NULL | NULL | NULL | +------+-------+------------+-------------
  • 54. 54 Indexing MySQL uses mainly B-trees for indexes Great for =, <, >, <=, =>, BETWEEN, some LIKE 'string%'
  • 55. 55 13.1.8 CREATE INDEX Syntax CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name [index_type] ON tbl_name (index_col_name,...) [index_option] [algorithm_option | lock_option] ... index_col_name: col_name [(length)] [ASC | DESC] index_type: USING {BTREE | HASH} index_option: KEY_BLOCK_SIZE [=] value | index_type | WITH PARSER parser_name | COMMENT 'string' algorithm_option: ALGORITHM [=] {DEFAULT|INPLACE|COPY} lock_option: LOCK [=] {DEFAULT|NONE|SHARED|EXCLUSIVE}
  • 56. 56 You may not know Can create multi-column indexes – Year,month,day index can be used for ● Year,Month,Day searches ● Year,Month searches ● Year searches – But not day or month, day searches Can index on part of a column – CREATE INDEX ndx1 ON customer (name(10)); NULLs are to be avoided – Extra optimizer steps UNIQUE indexes enforce no duplicates InnoDB will create an index if YOU DO NOT
  • 57. 57 13.2.5.2 INSERT … ON DUPLICATE KEY UPDATE If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row is performed. For example, if column a is declared as UNIQUE and contains the value 1, the following two statements have identical effect: INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;
  • 58. 58 FIND Asian countries with Population > 5 million EXPLAIN SELECT * FROM Country WHERE Continent = 'Asia' AND Population > 5000000;
  • 59. 59 Will indexes make query faster? SELECT * FROM Country WHERE Continent = 'Asia' AND Population > 5000000; Index columns – Population – Continent – Population and Continent – Continent and Population ● Which to pick? Optimizer will look at option that takes the least CPU and I/O ● Statistics from storage engine – Indexes to use, order to join tables, avoiding sorts, is sorting expensive
  • 60. 60 Index on Population Alter TABLE Country ADD INDEX p (Population) Down from 239!!
  • 61. 61 A reading from the Book of MySQL – Chapter 8 Optimization READ this section of the MySQL Manual – Hardware – Configuration – Queries – Locking – Benchmarkings Set InnoDb Buffer Cache Size to 75-80% of system RAM Move Logs off disks/controllers with data Use bigger stronger machines for replication slaves Looking at SSDs, FusionIO cards Monitor! Monitor! Monitor!
  • 62. 62 ALTER TABLE Country ADD INDEX C (continent) 42 is better than 239 or 54
  • 63. 63 CREATE INDEX cpidx ON Country (continent, Population) Event better still!
  • 64. 64 CREATE INDEX pcidx on Country (Population,continent) FORCING the use of the new index shows it is not optimal.
  • 65. 65 Index-es Indexes need maintenance – Run OPTIMIZE TABLE when system quiet – Each INSERT/DELETE takes overhead ● Slows you down – Therefore remove unused indexes – MySQL Utilities mysqlindexcheck to look for unused index on long running systems (do not use AFTER a restart) ● Use good naming convention, document – Statistics can be saved/reloaded at shutdown/reboot ● After a reboot w/o saving, the system is going to need to rebuild stats from scratch, run slower – Log queries not using Indexes ● Not all of these are bad, just recognize them
  • 67. 67 Transactions You will need to use a transactional storage engine like InnoDB, NDB You need to START a transaction, do the work and COMMIT to record the changes or ROLLBACK to cancel the recording. To avoid using ROLLBACK, you can employ the following strategy: – Use LOCK TABLES to lock all the tables you want to access. – Test the conditions that must be true before performing the update. – Update if the conditions are satisfied. – Use UNLOCK TABLES to release your locks. Note – This solution does not handle the situation when someone kills the threads in the middle of an update. In that case, all locks are released but some of the updates may not have been executed.
  • 68. 68 CREATE TABLE account (id int not null, balance decimal(6,2) default 0) INSERT INTO account VALUES (1,1000.10), (2,400), (3,0), (15,.99); START TRANSACTION; UPDATE account SET balance = 1010.10 WHERE id = 1; UPDATE account SET balance = 300 WHERE id=2; COMMIT; START TRANSACTION UPDATE account SET balance=1000 WHERE ID=3; ROLLBACK;
  • 69. 69 AUTOCOMMIT AUTOCOMMIT is set to 1 by default or 'do things as I type them mode'. – START TRANSACTION overrides – Some APIs like JDBC have own way to handle transactions (see chapter 23 in MySQL docs) AUTOCOMMIT set to 0 requires a COMMIT to store changes Some statements cannot be rolled back. In general, these include data definition language (DDL) statements, such as those that create or drop databases, those that create, drop, or alter tables or stored routines Some statements cause an implicit commit – DDL, user account changes, transaction control or locking statements, data loading, and replication control statements,
  • 70. 70 SET autocommit=0; LOCK TABLES t1 WRITE, t2 READ, ...; ... do something with tables t1 and t2 here ... COMMIT; UNLOCK TABLES; Note ROLLBACKS do not release locks
  • 71. 71 SET autocommit=0; LOCK TABLES t1 WRITE, t2 READ, ...; ... do something with tables t1 and t2 here ... COMMIT; UNLOCK TABLES; Note ROLLBACKS do not release locks
  • 73. 73 First steps of scaling ✔ Upgrade MySQL versions ✔ 5.5 is 25% faster than 5.1 ✔ 5.6 is 20% faster ✔ 5.7 will be faster still ✔ Memory ✔ DISKS ✔ Move logs to different platters, controllers Move large databases to own disks ✔ TUNE queries ✔ Scale horizontally especially is server is old ✔ Use bigger, more powerful boxes for replicants ✔ Optimize network
  • 75. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.75 MySQL Fabric HA, shards, Write scaling ● Lets you shard and reshard ● Very new ● Looking for feedback ● Released in May
  • 76. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.76 Why does that query take so LONG!!! ● I only want the top N customers! ● And I want them sorted by last name, first name, middle name, postal code, and average purchase! The stuff on the right hand side of the WHERE clause may seem like a great way to prune down the number of records for a query BUT that is not always the case.
  • 77. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.77 Query we want to 'prune' Select City.name AS 'City', Country.name AS 'Country', City.Population AS 'Pop' FROM City JOIN Country ON (City.CountryCode = Country.Code)
  • 78. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.78 The Stats! Total cost is about 5780. Bad news is that we have a Full Table Scan on City but that was expected. Good news is that we have 1 read on Country using the Primary key to get record we need to match on the join.
  • 79. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.79 Now we only want the top ten! LIMIT 10 But the cost is still 5780!?!?! The server has to run through ALL the records to get just the first ten. So LIMIT does not buy us any speed in this case.
  • 80. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.80 ORDER BY Country Oh no! The cost jumps to 9858! And we have to open a temp table and then sort that file (expensive)
  • 81. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.81 WHERE City.Population > 1000000 Nope. The server is doing the join first and then tossing away the records that are <= 1000000
  • 82. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.82 SELECT a.name FROM (select City.name, CountryCode, Population from City where Population > 1000000) as a
  • 83. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.83 SELECT a.name FROM (select City.name, CountryCode, Population from City where Population > 1000000) as a JOIN Country ON (a.CountryCode = Country.code) That is an awful lot of work to try to save a few hundred and not very pretty.
  • 84. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.84 The World is too small ● The World database is too small and already too optimized to gain much more performance. ● So lets create a new database and copy the date to get 100K+ records; – CREATE TABLE bigcity LIKE city; – INSERT INTO bigcity SELECT * FROM city; INSERT INTO bigcity SELECT NULL,Name,CountryCode,District,Population FROM bigcity; <many times>
  • 85. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.85 SELECT countrycode, sum(Population) FROM bigcity GROUP BY CountryCode; ● Takes .343 seconds ● ALTER TABLE bigcity ADD INDEX cc_pop (CountryCode,Population); ● Now takes .156 seconds
  • 86. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.86 Zipcode – from a free zipcode database SELECT city, state, zipcode FROM zip WHERE zipcode = '76262'; Pretend out app provides the us postal code given the city and state data. Here no indexes.
  • 87. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.87 Indexing city and state only. ● CREATE INDEX csidx ON zip (city(8),state(2)); /* only use first 8 on name and 2 for state*/ ● SELECT city, state, zipcode FROM zip WHERE city='Roanoke' and state='TX';
  • 88. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.88 Good optimization The optimizer uses the city and state from the index we created to know where to dive into the data. Some indexes will be able to supply all the data so there will be no need to dive into the data.
  • 89. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.89 Composite and Compound Indexes ● Composite index covers all the columns in the filter, join and select criteria. All of these columns are stored on all of the index pages accordingly in the index B-tree. ● Compound index covers all the filter and join key columns in the B-tree, keeping the select columns only on the leaf pages as they will not be searched, only extracted! This saves space and creates smaller index pages (and is faster).
  • 90. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.90 NoSQL ● No Schema – Teenager with all belongings in center of room – Does not scale easily – ACID compliance not important – Effective for some jobs but watch out if you need an AND or an OR
  • 91. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.91 http://en.wikipedia.org/wiki/NoSQL A NoSQL (often interpreted as Not Only SQL) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling and finer control over availability. The data structure (e.g. key-value, graph, or document) differs from the RDBMS, and therefore some operations are faster in NoSQL and some in RDBMS. There are differences though, and the particular suitability of a given NoSQL DB depends on the problem it must solve (e.g., does the solution use graph algorithms?).
  • 92. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.92 Key/value pair ● A key-value pair (KVP) is a set of two linked data items: a key, which is a unique identifier for some item of data, and the value, which is either the data that is identified or a pointer to the location of that data. http://searchenterprisedesktop.techtarget.com/definition/key-value-pair ● Berkeley Database (Sleepycat) 1986 – Hash table
  • 93. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.93 Graph Databases ● In computing, a graph database is a database that uses graph structures with nodes, edges, and properties to represent and store data. A graph database is any storage system that provides index- free adjacency. This means that every element contains a direct pointer to its adjacent elements and no index lookups are necessary http://en.wikipedia.org/wiki/Graph_database
  • 94. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.94 Document Databases A document-oriented database is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi- structured data. Document-oriented databases are one of the main categories of NoSQL databases and the popularity of the term "document-oriented database" (or "document store") has grown with the use of the term NoSQL itself. In contrast to relational databases and their notions of"Tables" and "Relations" (between Tables), these systems are designed around an abstract notion of a "Document". http://en.wikipedia.org/wiki/Document-oriented_database
  • 95. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.95 Hybrids
  • 96. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.96 Bypassing syntax checking and optimizer upto 9x faster than SQL
  • 97. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.97 Paradigm Shifting w/o a clutch ● SQL is a declarative language ● PHP is procedural/object orientated ● Square Pegs for Round Holes
  • 98. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.98 Object Relation Mapper ● Adds another layer – Easier to learn SQL – Not as expressive as SQL – ORMs are a hammer in a world of nails and blasting caps – If you can not solve problem with an object, adding another object is not going to help
  • 99. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.99 Query Rewrite plugin or How To Fix your ORM's lousy SQL mysql < <plugin directory>/install_rewriter_plugin.sql INSERT INTO query_rewrite.rewrite_rules( pattern, replacement ) VALUES ( 'SELECT ?', 'SELECT ? + 1' );
  • 100. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.100 Better Example ● SELECT patientId, time FROM patient_records WHERE patientId > 1000 AND patientId < 2000 AND time < “2014-09-08” AND illness = ”Headache” ORDER BY time LIMIT 1; ● SELECT patiendId, time FROM patient_records WHERE patientId > 1000 AND patientID < 2000 AND time < “2014-09-08” AND illness = ”Acrocephalosyndactylia” * ORDER BY time LIMIT 1;
  • 101. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.101 The Rules for the rewrite INSERT INTO query_rewrite.rewrite_rules ( pattern, pattern_database, replacement ) VALUES ( # Pattern 'SELECT patientid, time ' || 'FROM patient_records ' || 'WHERE patientid > ? AND patientid < ? ' || 'AND time < ? AND illness = ? ' || 'ORDER BY time LIMIT 1', # Database 'health_clinic', # Replacement 'SELECT patientid, time ' || 'FROM patient_records FORCE INDEX(patientIdIdx) ' || 'WHERE patientid > ? AND patientid < ? ' || 'AND time < ? AND illness = ? ' || 'ORDER BY time LIMIT 1');
  • 102. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.102 Recommended Books ● High Performance MySQL, 3rd Edition Baron Schwartz, Peter Zaitsev & Vadim Tkachenko ● SQL Antipatterns – Bill Karwin ● SQL and Relational Theory – CJ Date ● Effective MySQL Replication Techniques in Depth – Ronald Bradford
  • 103. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.103 Q/A ● Slideshare.net/davestokes ● @stoker ● David.Stokes@Oracle.com