© 2013 by Markus Winand
Indexes
The Neglected Performance
All-Rounder
Not
always that obvious,
unfortunately!
iStockPhoto
wildpixel
Takeaway #1: Pandemic Scale
It affects you!
(Symbolic image; not real data)
http://upload.wikimedia.org/wikipedia/commons/c/c7/2009_world_subdivisions_flu_pandemic.png
Takeaway #2: Caused by Success
Copyright © 2013 Telerik, Inc. All rights reserved
Takeaway #3: It’s Not Your Fault
http://simpsonswiki.com/wiki/File:I_Didn%27t_Do_It!_Volume_III.png
© 2013 by Markus Winand
The Problem
Improper Index Use
The Problem: Improper Index Use
“A very common cause of performance
problems is lack of proper indexes or the
use of queries that are not using
existing indexes.”
—Buda Consulting
http://www.budaconsulting.com/Portals/52677/docs/top_5_tech_brief.pdf
The Problem: Improper Index Use
“A very common cause of performance
problems is lack of proper indexes or the
use of queries that are not using
existing indexes.”
—Buda Consulting
http://www.budaconsulting.com/Portals/52677/docs/top_5_tech_brief.pdf
Quantifying the Problem
Percona White Paper:
Reasons of performance problems
that caused production downtime:
38% bad SQL
15% schema and indexing
http://www.percona.com/files/white-papers/causes-of-downtime-in-mysql.pdf
Quantifying the Problem
Survey by sqlskills.com:
Root causes of the last few SQL
Server performance problems:
27% T-SQL
19% Poor indexing
http://www.sqlskills.com/blogs/paul/survey-what-are-the-most-common-causes-of-performance-problems/
Quantifying the Problem
Craig S. Mullins (strategist and researcher):
„As much as 75% of poor relational performance
is caused by "bad" SQL and application code.”
Noel Yuhanna (Forrester Research):
„The key difficulties surrounding performance
continue to be poorly written SQL statements,
improper DBMS configuration and a lack of clear
understanding of how to tune databases to solve
performance issues.”
Quantifying the Problem
My observation:
~50% of SQL performance problems
are caused by improper index use
© 2013 by Markus Winand
The Root Cause
© 2013 by Markus Winand
The Root Cause
Admins are Indexing
The Root Cause: DBAs are Indexing
How did databases
work before SQL?
The Root Cause: DBAs are Indexing
Index use was intrinsically
tied to the queries.
The Root Cause: DBAs are Indexing
Example: dBase
Developers had to...
...use indexes explicitly when searching:
!"#$%&'"($#)$*+!#,&+-"
$$$.%&'$/%&+&'
...take care of index maintenance:
!"#$%&'"($#)$*+!#,&+-"0$%'(1
$$$$+22"&'
The Root Cause: DBAs are Indexing
SQL is an abstraction that only
defines the logical view.
The actual SQL implementation
takes care of everything else.
The Root Cause: DBAs are Indexing
Transactions
Constraints
Views
Tables
Data
manipulation
Queries
SQL (language)
has:
SQL Databases (software)
have:
The Root Cause: DBAs are Indexing
Backup
& recovery
Storage
management
Bugs &
patches
Tuning
parameters
Transactions
Constraints
Views
Tables
Data
manipulation
Queries
SQL (language)
has:
SQL Databases (software)
have:
High
Availability
The Root Cause: DBAs are Indexing
Indexes
Backup
& recovery
Storage
management
Bugs &
patches
Tuning
parameters
Transactions
Constraints
Views
Tables
Data
manipulation
Queries
SQL (language)
has:
SQL Databases (software)
have:
High
Availability
The Root Cause: DBAs are Indexing
Indexes
Backup
& recovery
Storage
management
Bugs &
patches
Tuning
parameters
Transactions
Constraints
Views
Tables
Data
manipulation
Queries
SQL Databases (software)
have:
Developers
High
Availability
The Root Cause: DBAs are Indexing
Indexes
Backup
& recovery
Storage
management
Bugs &
patches
Tuning
parameters
Transactions
Constraints
Views
Tables
Data
manipulation
Queries
Developers Administrators
High
Availability
The Root Cause: DBAs are Indexing
Indexing is considered a system
tuning task that belongs to the
administrators responsibilities.
The Root Cause: DBAs are Indexing
A misconception that causes new problems:
The Root Cause: DBAs are Indexing
A misconception that causes new problems:
DBAs don’t know
the queries
Have to “investigate”
to find the queries.
It is time consuming and
almost always incomplete.
by G-10gian82
deviantart.com
The Root Cause: DBAs are Indexing
A misconception that causes new problems:
DBAs don’t know
the queries
Have to “investigate”
to find the queries.
It is time consuming and
almost always incomplete.
DBAs can’t change
the queries
Can make the index
match the query.
Can’t make the query
match the index!
© 2013 by Markus Winand
The Solution
© 2013 by Markus Winand
The Solution
Indexing is a
Development Task
The Solution: It’s a Dev Task
Indexes
Backup
& recovery
Storage
management
Tuning
parameters
Transactions
Constraints
Views
Tables
Data
manipulation
Queries
Developers Administrators
High
Availability
Bugs &
patches
The Solution: It’s a Dev Task
Indexes
Backup
& recovery
Storage
management
Tuning
parameters
Transactions
Constraints
Views
Tables
Data
manipulation
Queries
Developers Administrators
Must match!
High
Availability
Bugs &
patches
Another Problem: It’s not Taught
Indexes are not part of the pure SQL (language) literature
because indexes are not part of the SQL standard.
11 SQL books analyzed: only 1.0% of the pages are
about indexes (70 out of 7330 pages).
Examples:
Oracle SQL by Example: 2.0% (19/960)
Beginning DBs with PostgreSQL: 0.8% (5/664)
Learning SQL: 3.3% (11/336 — highest rate in class)
Another Problem: It’s not Taught
Proper index usage is sometimes covered in database
tuning books but is always buried between hundreds of
pages of HW, OS and DB parameterization topics.
14 database administration books analyzed: 5.1% of the
pages are about indexes (307 out of 6069 pages).
Examples:
Oracle Performance Survival Guide: 5.2% (38/730)
High Performance MySQL: 8% (55/684)
PostgreSQL 9 High Performance: 5.8% (27/468)
Another Problem: It’s not Taught
Consequence:
Developers don’t know how to use
indexes properly.
Another Problem: It’s not Taught
Consequence:
Developers don’t know how to use
indexes properly.
Results of the 3-minute online quiz:
http://use-the-index-luke.com/3-minute-test
5 questions: each about a specific index
usage pattern.
Non-representative!
Q1: Good or Bad? (Function use)
345675$89:5;$#<*,%'($=9$#<*$>!"#$%&'()*+?@
A5B537$#"(#0$'+#",C)*D-&
$$E4=F$#<*
$/G545$,-%./012!"#$%&'()*+0$HIIIIH?$J$H1KLMH@
3-Minute Quiz: Indexing Skills
Q1: Good or Bad? (Function use)
345675$89:5;$#<*,%'($=9$#<*$>!"#$%&'()*+?@
A5B537$#"(#0$'+#",C)*D-&
$$E4=F$#<*
$/G545$,-%./012!"#$%&'()*+0$HIIIIH?$J$H1KLMH@
 


3-Minute Quiz: Indexing Skills
Q1: Good or Bad? (Function use)
345675$89:5;$#*,%'($=9$#*$!#$%'()*+?@
A5B537$#(#0$'+#,C)*D-
$$E4=F$#*
$/G545$,-%./012!#$%'()*+0$HIIIIH?$J$H1KLMH@
3-Minute Quiz: Indexing Skills
3-Minute Quiz: Indexing Skills
Q2: Good or Bad? (Indexed Top-N, no IOS)
345675$89:5;$#*,%'($=9$#*$+0$'+#,C)*?@
A5B537$%'0$+0$'+#,C)*
$$E4=F$#*
$/G545$+$J$NL
$=4:54$OI$'+#,C)*$:5A3
$3454,67@
3-Minute Quiz: Indexing Skills
Q2: Good or Bad? (Indexed Top-N, no IOS)
345675$89:5;$#*,%'($=9$#*$+0$'+#,C)*?@
A5B537$%'0$+0$'+#,C)*
$$E4=F$#*
$/G545$+$J$NL
$=4:54$OI$'+#,C)*$:5A3
$3454,67@

 


Understandable
controversy!
3-Minute Quiz: Indexing Skills
Q3: Good or Bad? (Column order)
CREATE INDEX tbl_idx ON tbl (a, b);
SELECT id, a, b FROM tbl
WHERE a = $1 AND b = $2;
SELECT id, a, b FROM tbl
WHERE b = $1;
3-Minute Quiz: Indexing Skills
Q3: Good or Bad? (Column order)
CREATE INDEX tbl_idx ON tbl (a, b);
SELECT id, a, b FROM tbl
WHERE a = $1 AND b = $2;
SELECT id, a, b FROM tbl
WHERE b = $1;
3-Minute Quiz: Indexing Skills
Q4: Good or Bad? (Indexing LIKE)
CREATE INDEX tbl_idx
ON tbl (text varchar_pattern_ops);
SELECT id, text
FROM tbl
WHERE text LIKE '%TERM%';
3-Minute Quiz: Indexing Skills
Q4: Good or Bad? (Indexing LIKE)
CREATE INDEX tbl_idx
ON tbl (text varchar_pattern_ops);
SELECT id, text
FROM tbl
WHERE text LIKE '%TERM%';
3-Minute Quiz: Indexing Skills
Q5: Good or Bad? (equality vs. ranges)
CREATE INDEX tbl_idx
ON tbl (date_col, state);
SELECT id, date_col, state FROM tbl
WHERE date_col =
CURRENT_DATE - INTERVAL '5' YEAR
AND state = 'X';
3-Minute Quiz: Indexing Skills
Q5: Good or Bad? (equality vs. ranges)
CREATE INDEX tbl_idx
ON tbl (date_col, state);
SELECT id, date_col, state FROM tbl
WHERE date_col =
CURRENT_DATE - INTERVAL '5' YEAR
AND state = 'X';
Indexes: The Neglected All-Rounder
Everybody knows indexing is
important for performance,
yet nobody takes the time to
learn and apply is properly.
Indexes: The Neglected All-Rounder
Index details are hardly known.
! “Details” like column-order or equality vs. range
conditions must be learned and understood.
Only one index capability is used: finding data quickly
! Indexes have three capabilities (powers):
finding data, clustering data, and sorting data.
Indexing is done from single query perspective.
! Should be done from application perspective
(considering all queries). It’s a design task!
Indexes: The Neglected All-Rounder
Are you just adding indexes
or
are you designing indexes?
About Markus Winand
Tuning developers for
high SQL performance
Training  co (one-man show):
winand.at
Geeky blog:
use-the-index-luke.com
Author of:
SQL Performance Explained

Indexes: The neglected performance all rounder

  • 1.
    © 2013 byMarkus Winand Indexes The Neglected Performance All-Rounder Not always that obvious, unfortunately! iStockPhoto wildpixel
  • 2.
    Takeaway #1: PandemicScale It affects you! (Symbolic image; not real data) http://upload.wikimedia.org/wikipedia/commons/c/c7/2009_world_subdivisions_flu_pandemic.png
  • 3.
    Takeaway #2: Causedby Success Copyright © 2013 Telerik, Inc. All rights reserved
  • 4.
    Takeaway #3: It’sNot Your Fault http://simpsonswiki.com/wiki/File:I_Didn%27t_Do_It!_Volume_III.png
  • 5.
    © 2013 byMarkus Winand The Problem Improper Index Use
  • 6.
    The Problem: ImproperIndex Use “A very common cause of performance problems is lack of proper indexes or the use of queries that are not using existing indexes.” —Buda Consulting http://www.budaconsulting.com/Portals/52677/docs/top_5_tech_brief.pdf
  • 7.
    The Problem: ImproperIndex Use “A very common cause of performance problems is lack of proper indexes or the use of queries that are not using existing indexes.” —Buda Consulting http://www.budaconsulting.com/Portals/52677/docs/top_5_tech_brief.pdf
  • 8.
    Quantifying the Problem PerconaWhite Paper: Reasons of performance problems that caused production downtime: 38% bad SQL 15% schema and indexing http://www.percona.com/files/white-papers/causes-of-downtime-in-mysql.pdf
  • 9.
    Quantifying the Problem Surveyby sqlskills.com: Root causes of the last few SQL Server performance problems: 27% T-SQL 19% Poor indexing http://www.sqlskills.com/blogs/paul/survey-what-are-the-most-common-causes-of-performance-problems/
  • 10.
    Quantifying the Problem CraigS. Mullins (strategist and researcher): „As much as 75% of poor relational performance is caused by "bad" SQL and application code.” Noel Yuhanna (Forrester Research): „The key difficulties surrounding performance continue to be poorly written SQL statements, improper DBMS configuration and a lack of clear understanding of how to tune databases to solve performance issues.”
  • 11.
    Quantifying the Problem Myobservation: ~50% of SQL performance problems are caused by improper index use
  • 12.
    © 2013 byMarkus Winand The Root Cause
  • 13.
    © 2013 byMarkus Winand The Root Cause Admins are Indexing
  • 14.
    The Root Cause:DBAs are Indexing How did databases work before SQL?
  • 15.
    The Root Cause:DBAs are Indexing Index use was intrinsically tied to the queries.
  • 16.
    The Root Cause:DBAs are Indexing Example: dBase Developers had to... ...use indexes explicitly when searching: !"#$%&'"($#)$*+!#,&+-" $$$.%&'$/%&+&' ...take care of index maintenance: !"#$%&'"($#)$*+!#,&+-"0$%'(1 $$$$+22"&'
  • 17.
    The Root Cause:DBAs are Indexing SQL is an abstraction that only defines the logical view. The actual SQL implementation takes care of everything else.
  • 18.
    The Root Cause:DBAs are Indexing Transactions Constraints Views Tables Data manipulation Queries SQL (language) has: SQL Databases (software) have:
  • 19.
    The Root Cause:DBAs are Indexing Backup & recovery Storage management Bugs & patches Tuning parameters Transactions Constraints Views Tables Data manipulation Queries SQL (language) has: SQL Databases (software) have: High Availability
  • 20.
    The Root Cause:DBAs are Indexing Indexes Backup & recovery Storage management Bugs & patches Tuning parameters Transactions Constraints Views Tables Data manipulation Queries SQL (language) has: SQL Databases (software) have: High Availability
  • 21.
    The Root Cause:DBAs are Indexing Indexes Backup & recovery Storage management Bugs & patches Tuning parameters Transactions Constraints Views Tables Data manipulation Queries SQL Databases (software) have: Developers High Availability
  • 22.
    The Root Cause:DBAs are Indexing Indexes Backup & recovery Storage management Bugs & patches Tuning parameters Transactions Constraints Views Tables Data manipulation Queries Developers Administrators High Availability
  • 23.
    The Root Cause:DBAs are Indexing Indexing is considered a system tuning task that belongs to the administrators responsibilities.
  • 24.
    The Root Cause:DBAs are Indexing A misconception that causes new problems:
  • 25.
    The Root Cause:DBAs are Indexing A misconception that causes new problems: DBAs don’t know the queries Have to “investigate” to find the queries. It is time consuming and almost always incomplete. by G-10gian82 deviantart.com
  • 26.
    The Root Cause:DBAs are Indexing A misconception that causes new problems: DBAs don’t know the queries Have to “investigate” to find the queries. It is time consuming and almost always incomplete. DBAs can’t change the queries Can make the index match the query. Can’t make the query match the index!
  • 27.
    © 2013 byMarkus Winand The Solution
  • 28.
    © 2013 byMarkus Winand The Solution Indexing is a Development Task
  • 29.
    The Solution: It’sa Dev Task Indexes Backup & recovery Storage management Tuning parameters Transactions Constraints Views Tables Data manipulation Queries Developers Administrators High Availability Bugs & patches
  • 30.
    The Solution: It’sa Dev Task Indexes Backup & recovery Storage management Tuning parameters Transactions Constraints Views Tables Data manipulation Queries Developers Administrators Must match! High Availability Bugs & patches
  • 31.
    Another Problem: It’snot Taught Indexes are not part of the pure SQL (language) literature because indexes are not part of the SQL standard. 11 SQL books analyzed: only 1.0% of the pages are about indexes (70 out of 7330 pages). Examples: Oracle SQL by Example: 2.0% (19/960) Beginning DBs with PostgreSQL: 0.8% (5/664) Learning SQL: 3.3% (11/336 — highest rate in class)
  • 32.
    Another Problem: It’snot Taught Proper index usage is sometimes covered in database tuning books but is always buried between hundreds of pages of HW, OS and DB parameterization topics. 14 database administration books analyzed: 5.1% of the pages are about indexes (307 out of 6069 pages). Examples: Oracle Performance Survival Guide: 5.2% (38/730) High Performance MySQL: 8% (55/684) PostgreSQL 9 High Performance: 5.8% (27/468)
  • 33.
    Another Problem: It’snot Taught Consequence: Developers don’t know how to use indexes properly.
  • 34.
    Another Problem: It’snot Taught Consequence: Developers don’t know how to use indexes properly. Results of the 3-minute online quiz: http://use-the-index-luke.com/3-minute-test 5 questions: each about a specific index usage pattern. Non-representative!
  • 35.
    Q1: Good orBad? (Function use) 345675$89:5;$#<*,%'($=9$#<*$>!"#$%&'()*+?@ A5B537$#"(#0$'+#",C)*D-& $$E4=F$#<* $/G545$,-%./012!"#$%&'()*+0$HIIIIH?$J$H1KLMH@ 3-Minute Quiz: Indexing Skills
  • 36.
    Q1: Good orBad? (Function use) 345675$89:5;$#<*,%'($=9$#<*$>!"#$%&'()*+?@ A5B537$#"(#0$'+#",C)*D-& $$E4=F$#<* $/G545$,-%./012!"#$%&'()*+0$HIIIIH?$J$H1KLMH@ 3-Minute Quiz: Indexing Skills
  • 37.
    Q1: Good orBad? (Function use) 345675$89:5;$#*,%'($=9$#*$!#$%'()*+?@ A5B537$#(#0$'+#,C)*D- $$E4=F$#* $/G545$,-%./012!#$%'()*+0$HIIIIH?$J$H1KLMH@ 3-Minute Quiz: Indexing Skills
  • 38.
    3-Minute Quiz: IndexingSkills Q2: Good or Bad? (Indexed Top-N, no IOS) 345675$89:5;$#*,%'($=9$#*$+0$'+#,C)*?@ A5B537$%'0$+0$'+#,C)* $$E4=F$#* $/G545$+$J$NL $=4:54$OI$'+#,C)*$:5A3 $3454,67@
  • 39.
    3-Minute Quiz: IndexingSkills Q2: Good or Bad? (Indexed Top-N, no IOS) 345675$89:5;$#*,%'($=9$#*$+0$'+#,C)*?@ A5B537$%'0$+0$'+#,C)* $$E4=F$#* $/G545$+$J$NL $=4:54$OI$'+#,C)*$:5A3 $3454,67@ Understandable controversy!
  • 40.
    3-Minute Quiz: IndexingSkills Q3: Good or Bad? (Column order) CREATE INDEX tbl_idx ON tbl (a, b); SELECT id, a, b FROM tbl WHERE a = $1 AND b = $2; SELECT id, a, b FROM tbl WHERE b = $1;
  • 41.
    3-Minute Quiz: IndexingSkills Q3: Good or Bad? (Column order) CREATE INDEX tbl_idx ON tbl (a, b); SELECT id, a, b FROM tbl WHERE a = $1 AND b = $2; SELECT id, a, b FROM tbl WHERE b = $1;
  • 42.
    3-Minute Quiz: IndexingSkills Q4: Good or Bad? (Indexing LIKE) CREATE INDEX tbl_idx ON tbl (text varchar_pattern_ops); SELECT id, text FROM tbl WHERE text LIKE '%TERM%';
  • 43.
    3-Minute Quiz: IndexingSkills Q4: Good or Bad? (Indexing LIKE) CREATE INDEX tbl_idx ON tbl (text varchar_pattern_ops); SELECT id, text FROM tbl WHERE text LIKE '%TERM%';
  • 44.
    3-Minute Quiz: IndexingSkills Q5: Good or Bad? (equality vs. ranges) CREATE INDEX tbl_idx ON tbl (date_col, state); SELECT id, date_col, state FROM tbl WHERE date_col = CURRENT_DATE - INTERVAL '5' YEAR AND state = 'X';
  • 45.
    3-Minute Quiz: IndexingSkills Q5: Good or Bad? (equality vs. ranges) CREATE INDEX tbl_idx ON tbl (date_col, state); SELECT id, date_col, state FROM tbl WHERE date_col = CURRENT_DATE - INTERVAL '5' YEAR AND state = 'X';
  • 46.
    Indexes: The NeglectedAll-Rounder Everybody knows indexing is important for performance, yet nobody takes the time to learn and apply is properly.
  • 47.
    Indexes: The NeglectedAll-Rounder Index details are hardly known. ! “Details” like column-order or equality vs. range conditions must be learned and understood. Only one index capability is used: finding data quickly ! Indexes have three capabilities (powers): finding data, clustering data, and sorting data. Indexing is done from single query perspective. ! Should be done from application perspective (considering all queries). It’s a design task!
  • 48.
    Indexes: The NeglectedAll-Rounder Are you just adding indexes or are you designing indexes?
  • 49.
    About Markus Winand Tuningdevelopers for high SQL performance Training co (one-man show): winand.at Geeky blog: use-the-index-luke.com Author of: SQL Performance Explained