Solving performance problems in MySQL without denormalization

RENORMALIZE

Akiban Technologies, Inc. Confidential & Proprietary

Problem Statement

Schemas scale out

Data volume grows

Joins become a real bottleneck

Two Common Manifestations

SQL Joins
Queries become slower as more tables are joined.

Application Object Creations
Constructing an object is as expensive as SELECTing the sum of its parts.

Denormalize. Problem solved.


Application Growing Pains

[Figure: Complexity & Cost plotted against Time and Customers across releases V1 to V6. Labeled steps: Add Customers!, Get Caching, Replicate DB, De-normalize, Shard Database, Rip & Replace DB. Architecture labels: Web Server, Cache Server, MySQL, MySQL Slaves, MySQL Sharding, Rip & Replace Database Architecture (?).]

De·nor·mal·ize
[de-nawr-muh-lahyze]
verb, -ized, -iz·ing.
–verb (used with object)
1.  The process of attempting to optimize the read performance of a database by adding redundant data or by grouping data. (Wikipedia)

2.  Denormalize means to allow redundancy in a table so that the table can remain flat. (UCSD Blink)

3.  The process of restructuring a normalized data model to accommodate operational constraints or system limitations. (celiang.tongji.edu.cn)


Materialized Views
Persistent database object
Contains the results of a query
Store summary and pre-joined tables
Require maintenance/refresh for dynamic data
SELECT
DISTINCT(n.nid),n.sticky,n.title,n.created
FROM node n
INNER JOIN term_node tn0
ON n.vid = tn0.vid
WHERE n.status = 1
AND tn0.tid IN (77)
ORDER BY n.sticky DESC, n.created DESC
LIMIT 0, 25;

Result: using where, using filesort

Drupal Materialized View Project
CREATE TABLE `mv_drupalorg_node_by_term` (
  `entity_type` varchar(64) NOT NULL,
  `entity_id` int(10) unsigned NOT NULL DEFAULT '0',
  `term_tid` int(10) unsigned NOT NULL DEFAULT '0',
  `node_sticky` int(11) NOT NULL DEFAULT '0',
  `last_node_activity` int(11) NOT NULL DEFAULT '0',
  `node_created` int(11) NOT NULL DEFAULT '0',
  `node_title` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`entity_type`,`entity_id`,`term_tid`),
  KEY `activity` (`term_tid`,`node_sticky`,`last_node_activity`,`node_created`),
  KEY `creation` (`term_tid`,`node_sticky`,`node_created`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

SELECT DISTINCT entity_id AS nid, node_sticky AS sticky,
node_title AS title, node_created AS created
FROM mv_drupalorg_node_by_term
WHERE term_tid IN (77)
ORDER BY node_sticky DESC, node_created DESC
LIMIT 0, 25;

Result: using where, using temporary table
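
The trade-off on these two slides (a pre-joined table answers the query without a join, but needs refreshing when the base data changes) can be reproduced in a small sketch. It uses SQLite in place of MySQL so that it runs standalone; the trimmed column set, the table name `mv_node_by_term`, the sample rows and the `refresh_mv` helper are all assumptions made for illustration and are not taken from the Drupal Materialized View project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (nid INTEGER PRIMARY KEY, vid INTEGER, status INTEGER,
                   sticky INTEGER, title TEXT, created INTEGER);
CREATE TABLE term_node (vid INTEGER, tid INTEGER);
-- Simplified stand-in for the pre-joined table (assumed column set).
CREATE TABLE mv_node_by_term (entity_id INTEGER, term_tid INTEGER,
                              node_sticky INTEGER, node_created INTEGER,
                              node_title TEXT,
                              PRIMARY KEY (entity_id, term_tid));
INSERT INTO node VALUES (1, 1, 1, 0, 'first', 100),
                        (2, 2, 1, 1, 'second', 200),
                        (3, 3, 0, 0, 'unpublished', 300);
INSERT INTO term_node VALUES (1, 77), (2, 77), (3, 77), (2, 78);
""")

JOIN_QUERY = """
SELECT DISTINCT n.nid, n.sticky, n.title, n.created
FROM node n INNER JOIN term_node tn0 ON n.vid = tn0.vid
WHERE n.status = 1 AND tn0.tid IN (77)
ORDER BY n.sticky DESC, n.created DESC LIMIT 25
"""
MV_QUERY = """
SELECT DISTINCT entity_id, node_sticky, node_title, node_created
FROM mv_node_by_term WHERE term_tid IN (77)
ORDER BY node_sticky DESC, node_created DESC LIMIT 25
"""

def refresh_mv(conn):
    """Hypothetical full refresh: rebuild the pre-joined table from scratch."""
    conn.execute("DELETE FROM mv_node_by_term")
    conn.execute("""
        INSERT INTO mv_node_by_term
        SELECT n.nid, tn.tid, n.sticky, n.created, n.title
        FROM node n INNER JOIN term_node tn ON n.vid = tn.vid
        WHERE n.status = 1
    """)

refresh_mv(conn)
joined = conn.execute(JOIN_QUERY).fetchall()   # answer computed with a join
from_mv = conn.execute(MV_QUERY).fetchall()    # same answer, no join

# A change to the base table is invisible to the view until it is refreshed.
conn.execute("UPDATE node SET title = 'renamed' WHERE nid = 1")
stale = conn.execute(MV_QUERY).fetchall()
refresh_mv(conn)
fresh = conn.execute(MV_QUERY).fetchall()
```

A full rebuild is the simplest refresh strategy; it is used here only to make the staleness window visible.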


Denormalization Technique Listing

Technique | Pros | Cons
Materialized views | Faster queries (no joins) | Data explosion; manually keep synched
Store object as Blob | Fast object get | No modeling, or querying
Denormalize 1NF: folding parent-child into parent table | Data in one row | Limited # of child rows; hard to query (UNION hell)
Denormalize 2NF to 1NF: repeat columns from 1 table in M table | Avoid join | Data explosion; manually keep synched (double writing)
Adding derived columns | Avoid joins, aggregation | Manually keep synched
Property bag (RDF) | Schema flexibility | Manage schema in app; hard to index or perform

Renormalization

Join for free
- Improved performance. 10-100x!
- Retrieve an object in one request


Introduction to Table-Groups
Traditional SQL
Schema à Table à Column

Akiban newSQL
Schema à GROUP à Table à Column

Table-Groups are first class citizens


Typical Relational DB Schema

[Figure: schema diagram.]

Typical Schema: Grouped

[Figure: the schema diagram partitioned into table-groups: Block Group, User Group, Node Group.]

Table-Groups Eliminate Joins

Logical: three tables in one table-group (labeled "Artist Table-group" on the slide)

Users (uid, name, pass)
1  rriegel  ***
2  twegner  ***

Users_Roles (id, rid)
1  1
1  2
2  1

Sessions (id, sid, timestamp)
1  19390  2011-10-01-06:02.00
2  22828  2011-10-04-22:32.10
1  49377  2011-10-04-16:07.30

Physical: separate tables keep one bTree each (Table, Table: bTree, bTree); a Table Group uses a single bTree.


Benefits of Table-grouping
SQL join operations are fast
-  Table Group access is equivalent to a single table access. Joins are free!
-  Performance increases 10-100x

Applications do not change
-  Maintain the same tables and SQL
-  Objects (e.g. ORM) fetched in one request
-  Akiban uses standard MySQL replication
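
One way to picture why a table-group access can behave like a single table access is to store parent and child rows interleaved in one sorted structure, keyed by the parent id first. The sketch below is a conceptual illustration only: the deck does not describe Akiban's storage format, so the key layout `(uid, table_rank, child_key)`, the `GroupStore` class and the sample rows are assumptions, and a sorted Python list stands in for a bTree.

```python
import bisect

# Assumed ordering of tables inside one group: parent first, then children.
TABLE_ORDER = {"users": 0, "users_roles": 1, "sessions": 2}

class GroupStore:
    """Toy table-group: every row of every table lives in one sorted key space."""

    def __init__(self):
        self._keys = []   # sorted list of (uid, table_rank, child_key)
        self._rows = []   # rows stored in the same order as _keys

    def insert(self, table, uid, child_key, row):
        key = (uid, TABLE_ORDER[table], child_key)
        i = bisect.bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._rows.insert(i, (table, row))

    def fetch_group(self, uid):
        # One contiguous range scan returns the parent and all of its children.
        lo = bisect.bisect_left(self._keys, (uid,))
        hi = bisect.bisect_left(self._keys, (uid + 1,))
        return self._rows[lo:hi]

store = GroupStore()
# Rows arrive in arbitrary order; the key keeps each user's rows adjacent.
store.insert("sessions", 1, 49377, {"timestamp": "2011-10-04-16:07.30"})
store.insert("users", 2, 0, {"name": "twegner"})
store.insert("users_roles", 1, 2, {"rid": 2})
store.insert("users", 1, 0, {"name": "rriegel"})
store.insert("sessions", 2, 22828, {"timestamp": "2011-10-04-22:32.10"})
store.insert("users_roles", 1, 1, {"rid": 1})
store.insert("sessions", 1, 19390, {"timestamp": "2011-10-01-06:02.00"})
store.insert("users_roles", 2, 1, {"rid": 1})

user_1 = store.fetch_group(1)
user_2 = store.fetch_group(2)
```

Because all rows for a user are adjacent, assembling the object needs one range lookup rather than one lookup per table, which is the property the bullets above describe.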


Design Partner Sample Query

SELECT t1.id, t3.c1,
t3.c2, t3.c3, t3.c4
FROM t1
INNER JOIN t2 ON t2.id = t1.id
LEFT JOIN t3 ON t1.id = t3.id
WHERE t2.region IN (1297789)
AND t1.c1 = '0'
ORDER BY t1.latestLogin DESC
LIMIT 500


Typical MySQL EXPLAIN Plan

[Figure: plan tree with 10 numbered steps. Steps 1 to 7 cover 3 Index Accesses, 2 Table Accesses and 2 Joins; step 8 is Temp Table, step 9 is Sort, step 10 is Project Results.]


Efficiency for Speed and Scale

No Joins, Temp Tables or Sorts!

[Figure: two plans side by side. Table-group plan: step 1 is 1 Group Index Access, step 2 is 1 Group Access, step 3 is Project Results. Typical MySQL EXPLAIN: 3 Index Accesses, 2 Table Accesses, 2 Joins, Temp Table, Sort, Project Results.]


Design Partner Acceleration: 27x

[Figure: chart with axis Concurrent Connections.]


Object Creation Query Stream

SELECT * FROM t1 WHERE uid = 1387;
SELECT * FROM t2 WHERE uid = 1387;
SELECT * FROM t3 WHERE uid = 1387;
SELECT * FROM t4 WHERE uid = 1387;
SELECT * FROM t5 WHERE uid = 1387;
SELECT * FROM t6 WHERE uid = 1387;
...
...
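
The query stream above is what a typical data-access layer does when it assembles one object from several tables. A minimal sketch, assuming SQLite in place of MySQL, three tables instead of six, and made-up columns and rows (`name`, `setting`, `order_no` are hypothetical), makes the cost visible by counting the statements issued:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE t1 (uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE t2 (uid INTEGER, setting TEXT);
CREATE TABLE t3 (uid INTEGER, order_no INTEGER);
INSERT INTO t1 VALUES (1387, 'example');
INSERT INTO t2 VALUES (1387, 'a'), (1387, 'b');
INSERT INTO t3 VALUES (1387, 10);
""")

# Fixed list of child tables; names are never taken from user input.
CHILD_TABLES = ("t2", "t3")

def load_object(conn, uid):
    """Assemble one object with one SELECT per table, counting the queries."""
    queries = 0
    parent = conn.execute("SELECT * FROM t1 WHERE uid = ?", (uid,)).fetchone()
    queries += 1
    obj = dict(parent)
    for table in CHILD_TABLES:
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE uid = ?", (uid,)
        ).fetchall()
        queries += 1
        obj[table] = [dict(r) for r in rows]
    return obj, queries

obj, queries = load_object(conn, 1387)
```

The number of statements grows with the number of tables behind the object, which is the cost the deck summarizes as "the sum of its parts".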


Becomes Single ORM Request
SELECT *,
(SELECT * FROM t2 WHERE t2.uid = t1.uid),
(SELECT * FROM t3 WHERE t3.uid = t1.uid),
...
FROM t1 WHERE t1.uid = 1387;

Or simply:

get my_schema:t1:uid=1387


Object Access in One Request


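Application Integration

-  Data replicated to Akiban
-  Fully independent server
-  HA Redirect Enabled

[Figure: MySQL Master (MyISAM / InnoDB storage) connected by replication through a MySQL adapter to the Akiban Server. Write Operations are labeled on the MySQL Master side, Problem Queries on the Akiban Server side.]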

Akiban is looking for Design Partners!

Do you have
•  Slow multi-join read queries?
•  User concurrency or data volume challenges?

http://www.akiban.com/design-partner-program


Ah, so you’re…
Denormalizing…no.
-  Schema doesn’t change
-  Data is stored once, more efficiently
Materializing Views…no.
-  No triggers or post-processing
-  No secondary logical objects
Introducing Write Latency…no.
-  Previous design partner showed 2x write improvement


Table-Grouping: A Closer Look

Each table maintains its own bTree

Indexes add their own bTrees
•  Covering index
•  Index on frequently joined columns
•  Index on common sort order

How many indexes do you maintain?
•  Slow updates == reduced concurrency
•  More resources == more overhead
•  Ongoing maintenance == high TCO

[Figure: Artist table (id, name, gender; rows: 1 Lennon M, 2 Joplin F) stored in a Table bTree, alongside a Covering Index, a Join Cols Index and a Sort Order Index.]

