Hey
I'm Haki Benita
A software developer and a technical lead.
I take special interest in databases, web development, software
design and performance tuning.
©hakibenita.com©hakibenita.com
What's So Special About SQL?
Available to different people inside the organization
Used not only by developers
Crucial for decision making
Common Mistakes in SQL
A Simple Sales Report
Branches in NY and LA
Customers are members or unknown
Some sales are given a special discount
Some products are free (give away)
Sales Database
 id │ branch │ sold_at                │ customer │ product   │ price │ discount
────┼────────┼────────────────────────┼──────────┼───────────┼───────┼──────────
  1 │ NY     │ 2020-04-01 03:15:00+00 │ Bill     │ Shoes     │ 10000 │     1000
  2 │ NY     │ 2020-04-01 04:00:00+00 │ ¤        │ Shoes     │  5000 │        0
  3 │ LA     │ 2020-04-01 06:15:00+00 │ Lily     │ Shoes     │ 15000 │        0
  4 │ LA     │ 2020-04-01 09:10:00+00 │ John     │ Shoes     │  5000 │     2500
  5 │ NY     │ 2020-04-01 03:15:00+00 │ ¤        │ Shirt     │  1500 │        0
  6 │ NY     │ 2020-04-01 02:07:00+00 │ John     │ Shirt     │  1850 │        0
  7 │ LA     │ 2020-03-31 09:55:00+00 │ Bill     │ Shirt     │  1250 │        0
  8 │ LA     │ 2020-03-31 10:45:00+00 │ Lily     │ Shirt     │  1850 │      100
  9 │ NY     │ 2020-03-31 07:45:00+00 │ Lily     │ Pants     │  5200 │        0
 10 │ LA     │ 2020-03-31 10:45:00+00 │ John     │ Pants     │  5200 │        0
 11 │ LA     │ 2020-04-01 07:01:00+00 │ David    │ Pants     │  4500 │        0
 12 │ LA     │ 2020-04-02 06:01:00+00 │ ¤        │ Hat       │  8000 │     8000
 13 │ LA     │ 2020-04-02 06:01:00+00 │ Bill     │ Give Away │     0 │        0
 14 │ NY     │ 2020-03-31 17:01:00+00 │ ¤        │ Give Away │     0 │        0
 15 │ LA     │ 2020-04-01 10:45:00+00 │ ¤        │ Give Away │     0 │        0
A Simple Sales Report...
What can possibly go wrong?
What is the discount rate on Shoes
SELECT price, discount, discount / price * 100 as discount_rate
FROM sale
WHERE product ='Shoes';
price │ discount │ discount_rate
───────┼──────────┼───────────────
10000 │ 1000 │ 0
5000 │ 0 │ 0
15000 │ 0 │ 0
5000 │ 2500 │ 0
Is this correct?
Be Careful When Dividing Integers
Integer division truncates the result
SELECT 1000 / 10000;
?column?
----------
0
SELECT 1000 / 10000::float;
?column?
----------
0.1
SELECT 1000 / 10000::float * 100;
?column?
----------
10
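The truncation above can be reproduced outside PostgreSQL. The following is a minimal sketch that assumes SQLite (through Python's built-in `sqlite3` module) as a stand-in; SQLite has no `::float` shorthand, so the cast is written as `CAST(... AS REAL)`. The helper name `scalar` is made up for this illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Both operands are integers, so the fractional part is dropped.
truncated = scalar("SELECT 1000 / 10000")

# Casting one operand to a floating point type keeps the fraction.
ratio = scalar("SELECT 1000 / CAST(10000 AS REAL)")
percent = scalar("SELECT 1000 / CAST(10000 AS REAL) * 100")

print(truncated, ratio, percent)
```

Casting a single operand is enough: once one side is a float, the whole division is carried out in floating point.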
What is the discount rate on Shoes
SELECT price, discount, discount / price::float * 100 as discount_rate
FROM sale
WHERE product ='Shoes';
price │ discount │ discount_rate
───────┼──────────┼───────────────
10000 │ 1000 │ 10
5000 │ 0 │ 0
15000 │ 0 │ 0
5000 │ 2500 │ 50
What is the average discount rate by product
SELECT
product,
AVG(discount / price::float) * 100 as discount_rate
FROM sale
GROUP BY product;
ERROR: division by zero
What is the average discount rate by product
SELECT id, product, price, discount
FROM sale
WHERE price = 0;
id │ product │ price │ discount
────┼───────────┼───────┼──────────
13 │ Give Away │ 0 │ 0
14 │ Give Away │ 0 │ 0
15 │ Give Away │ 0 │ 0
The price of the “Give Away” product is zero, which causes the division by zero
Guard Against "division by zero" Errors
Use NULLIF
SELECT
id, product, price, discount,
discount / NULLIF(price, 0) AS discount_rate
FROM sale
WHERE price = 0;
id │ product │ price │ discount │ discount_rate
────┼───────────┼───────┼──────────┼───────────────
13 │ Give Away │ 0 │ 0 │ ¤
14 │ Give Away │ 0 │ 0 │ ¤
15 │ Give Away │ 0 │ 0 │ ¤
Guard Against "division by zero" Errors
Use COALESCE
SELECT
product,
COALESCE(AVG(discount / NULLIF(price, 0)::float), 0) * 100
FROM sale
GROUP BY product;
product │ ?column?
───────────┼────────────────────
Shirt │ 1.3513513513513513
Pants │ 0
Hat │ 100
Shoes │ 15
Give Away │ 0
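The NULLIF and COALESCE combination can be tried on a small sample. This is a sketch that assumes SQLite via Python's `sqlite3` as a stand-in, with a trimmed-down `sale` table holding only a few of the rows above. One caveat: SQLite returns NULL for a division by zero instead of raising an error as PostgreSQL does, so the sketch only illustrates the shape of the guard, not the error itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (product TEXT, price INTEGER, discount INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?)",
    [
        ("Shoes", 10000, 1000),
        ("Shoes", 5000, 0),
        ("Shoes", 15000, 0),
        ("Shoes", 5000, 2500),
        ("Hat", 8000, 8000),
        ("Give Away", 0, 0),
        ("Give Away", 0, 0),
    ],
)

rows = conn.execute(
    """
    SELECT
        product,
        -- NULLIF turns a zero price into NULL, so the division yields NULL;
        -- COALESCE maps an average over nothing but NULLs back to 0.
        COALESCE(AVG(discount / CAST(NULLIF(price, 0) AS REAL)), 0) * 100
    FROM sale
    GROUP BY product
    """
).fetchall()

discount_rate = dict(rows)
print(discount_rate)
```

Whether a product with no priced sales should report 0 or stay NULL is a reporting decision; COALESCE is only needed for the former.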
How many unique users purchased each product
SELECT
product,
COUNT(DISTINCT customer) AS customers
FROM sale
GROUP BY product;
product │ customers
───────────┼───────────
Give Away │ 1
Hat │ 0 <--- ??
Pants │ 3
Shirt │ 3
Shoes │ 3
Is this correct?
Be Careful When Aggregating Nullable Columns
Aggregate functions ignore NULL values
SELECT product, COUNT(*) AS cnt, COUNT(customer) AS cnt_customer
FROM sale
GROUP BY product;
product │ cnt │ cnt_customer
───────────┼─────┼──────────────
Shirt │ 4 │ 3
Pants │ 3 │ 3
Hat │ 1 │ 0
Shoes │ 4 │ 3
Give Away │ 3 │ 1
This can also be useful...
How many members purchased each product
Aggregate functions ignore NULL values
SELECT
product,
COUNT(customer) as members,
COUNT(*) - COUNT(customer) as unknown_customers
FROM sale
GROUP BY product;
product │ members │ unknown_customers
───────────┼─────────┼───────────────────
Shirt │ 3 │ 1
Pants │ 3 │ 0
Hat │ 0 │ 1
Shoes │ 3 │ 1
Give Away │ 1 │ 2
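The difference between `COUNT(*)` and `COUNT(customer)` can be checked on a couple of rows. A minimal sketch, again assuming SQLite via Python's `sqlite3` as a stand-in and using only the Shoes and Hat rows from the sales table (Python's `None` is stored as NULL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (product TEXT, customer TEXT)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [
        ("Shoes", "Bill"),
        ("Shoes", None),  # unknown customer
        ("Shoes", "Lily"),
        ("Shoes", "John"),
        ("Hat", None),  # unknown customer
    ],
)

rows = conn.execute(
    """
    SELECT
        product,
        COUNT(*) AS cnt,                              -- counts every row
        COUNT(customer) AS members,                   -- skips NULL customers
        COUNT(*) - COUNT(customer) AS unknown_customers
    FROM sale
    GROUP BY product
    ORDER BY product
    """
).fetchall()

for row in rows:
    print(row)
```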
Write a query to find sales by any customer
You need to use a parameter...
SELECT * FROM sale WHERE customer = :name;
Sales to "Bill"
\set name '''Bill'''
SELECT id, customer FROM sale WHERE customer = :name;
id │ customer
────┼──────────
1 │ Bill
7 │ Bill
13 │ Bill
Sales to unknown customers
\set name null
SELECT id, customer FROM sale WHERE customer = :name;
id │ customer
────┼──────────
(0 rows)
Is this correct?
Be Careful Comparing NULL
To compare with NULL, use IS
Condition             │ Result
──────────────────────┼────────
NULL = NULL           │ NULL
(NULL = NULL) IS NULL │ t
NULL IS NULL          │ t
Write a query to find sales by any customer
We need to handle two cases...
SELECT *
FROM sale
WHERE
(:name IS NULL AND customer IS NULL)
OR
(:name IS NOT NULL AND customer = :name);
Write a query to find sales by any customer
-- A member
\set name '''Bill'''
SELECT id, customer ...
id │ customer
────┼──────────
1 │ Bill
7 │ Bill
...
-- Unknown customers
\set name null
SELECT id, customer ...
id │ customer
────┼──────────
2 │ ¤
5 │ ¤
...
There must be a better way!
Be Careful Comparing NULL
Use IS DISTINCT FROM
SELECT *
FROM sale
WHERE customer IS NOT DISTINCT FROM :name;
IS DISTINCT FROM
Treats NULL like a literal value
a │ b │ equal │ is_not_distinct_from
───┼───┼───────┼──────────────────────
1 │ 1 │ t │ t
1 │ 2 │ f │ f
1 │ ¤ │ ¤ │ f
¤ │ ¤ │ ¤ │ t
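The same behaviour can be observed with a bound parameter. This sketch assumes SQLite via Python's `sqlite3` as a stand-in. In SQLite the `IS` operator accepts arbitrary operands and behaves like PostgreSQL's `IS NOT DISTINCT FROM`, so it is used here in its place; in PostgreSQL itself the query would be written as on the slide above. The function names are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [(1, "Bill"), (2, None), (3, "Lily"), (5, None), (7, "Bill")],
)


def sales_with_equals(name):
    """Plain equality: comparing with NULL is never true, so no row matches."""
    query = "SELECT id FROM sale WHERE customer = ? ORDER BY id"
    return [row[0] for row in conn.execute(query, (name,))]


def sales_null_safe(name):
    """Null-safe comparison (SQLite's IS): NULL matches NULL."""
    query = "SELECT id FROM sale WHERE customer IS ? ORDER BY id"
    return [row[0] for row in conn.execute(query, (name,))]


print(sales_with_equals("Bill"), sales_with_equals(None))
print(sales_null_safe("Bill"), sales_null_safe(None))
```

With the null-safe form, one query covers both members and unknown customers, and the two-branch `OR` condition is no longer needed.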
Find the amount of sales during each month
SELECT
date_trunc('month', sold_at) AS month,
sum(price) AS total_sales
FROM sale
GROUP BY month;
month │ total_sales
────────────────────────┼─────────────
2020-04-01 00:00:00-04 │ 37500
2020-03-01 00:00:00-05 │ 26850
Find the amount of sales during each month
What's the problem?
SET TIME ZONE 'America/New_York';
SELECT...
month │ total_sales
────────────────────────┼─────────────
2020-04-01 00:00:00-04 │ 37500
2020-03-01 00:00:00-05 │ 26850
SET TIME ZONE 'America/Los_Angeles';
SELECT...
month │ total_sales
────────────────────────┼─────────────
2020-04-01 00:00:00-07 │ 17500
2020-03-01 00:00:00-08 │ 46850
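To see why the totals move, the same bucketing can be done by hand. The sketch below uses Python's `zoneinfo` (it assumes time zone data is available on the machine) and the priced rows from the sales table; the function name `monthly_totals` is made up for this illustration. It should reproduce the two sets of totals shown above.

```python
from collections import defaultdict
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# (sold_at in UTC, price) for the priced rows of the sales table.
# The three zero-price "Give Away" rows are left out; they do not change the totals.
SALES = [
    ("2020-04-01 03:15", 10000),
    ("2020-04-01 04:00", 5000),
    ("2020-04-01 06:15", 15000),
    ("2020-04-01 09:10", 5000),
    ("2020-04-01 03:15", 1500),
    ("2020-04-01 02:07", 1850),
    ("2020-03-31 09:55", 1250),
    ("2020-03-31 10:45", 1850),
    ("2020-03-31 07:45", 5200),
    ("2020-03-31 10:45", 5200),
    ("2020-04-01 07:01", 4500),
    ("2020-04-02 06:01", 8000),
]


def monthly_totals(tz_name):
    """Sum prices per calendar month as seen from the given time zone."""
    tz = ZoneInfo(tz_name)
    totals = defaultdict(int)
    for sold_at, price in SALES:
        utc = datetime.strptime(sold_at, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
        local = utc.astimezone(tz)  # same instant, different wall-clock date
        totals[local.strftime("%Y-%m")] += price
    return dict(totals)


new_york = monthly_totals("America/New_York")
los_angeles = monthly_totals("America/Los_Angeles")
print(new_york)
print(los_angeles)
```

Sales made in the first hours of April 1 UTC still fall on March 31 in both cities, and a few more of them do so in Los Angeles than in New York, which is where the difference comes from.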
Dates and Times Are Hard!
Time Zones
Daylight Saving
Database types
Dates and Times Are Hard!
When truncating timestamps, always set a time zone
Wrong!                       │ Right!
─────────────────────────────┼──────────────────────────────────────────────────────────────
date_trunc('month', sold_at) │ date_trunc('month', sold_at at time zone 'America/New_York')
extract('hour' from sold_at) │ extract('hour' from sold_at at time zone 'America/New_York')
date_part('hour', sold_at)   │ date_part('hour', sold_at at time zone 'America/New_York')
sold_at::date                │ (sold_at at time zone 'America/New_York')::date
'2020-03-22 11:00'           │ '2020-03-22 11:00 America/New_York'
What is the busiest hour of the day?
SELECT
extract('hour' from sold_at at time zone 'America/New_York') AS hour_of_day,
COUNT(*) AS sales
FROM sale
GROUP BY 1;
hour_of_day | sales
────────────┼───────
10 | 3
6 | 3
9 | 2
...
Is this correct?
What is the busiest hour of the day?
SELECT
extract('hour' FROM sold_at AT TIME ZONE CASE
WHEN branch = 'NY' THEN 'America/New_York'
WHEN branch = 'LA' THEN 'America/Los_Angeles'
END) AS hour_of_day,
COUNT(*) AS sales
FROM sale
GROUP BY hour_of_day
ORDER BY sales DESC;
hour_of_day | sales
────────────┼───────
23 | 5
3 | 4
0 | 2
...
How many sales were there in March?
In America/New_York time
SELECT count(*)
FROM sale
WHERE sold_at
BETWEEN '2020-03-01 America/New_York'
AND '2020-04-01 America/New_York';
count
───────
9
How many sales were there in April?
In America/New_York time
SELECT count(*)
FROM sale
WHERE sold_at
BETWEEN '2020-04-01 America/New_York'
AND '2020-05-01 America/New_York';
count
───────
7
Is this correct?
How many sales are there???
SELECT count(*) FROM sale;
count
───────
15
9 + 7 != 15
BETWEEN Is Inclusive
One sale is counted twice
SELECT id, sold_at
FROM sale
WHERE sold_at = '2020-04-01 America/New_York';
id │ sold_at
────┼────────────────────────
2 │ 2020-04-01 00:00:00-04
Sale happened exactly at midnight...
BETWEEN Is Inclusive
Use Half Open Ranges that don’t overlap
SELECT count(*)
FROM sale
WHERE sold_at >= '2020-03-01 America/New_York'
AND sold_at < '2020-04-01 America/New_York';
count
───────
8
It's longer, but safer!
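The double counting is easy to reproduce. This is a sketch that assumes SQLite via Python's `sqlite3` as a stand-in, with timestamps simplified to ISO-formatted text without a time zone; the three rows and the helper `count` are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (sold_at TEXT)")
conn.executemany(
    "INSERT INTO sale VALUES (?)",
    [
        ("2020-03-15 12:00:00",),
        ("2020-04-01 00:00:00",),  # exactly on the month boundary
        ("2020-04-10 09:30:00",),
    ],
)

MARCH = "2020-03-01 00:00:00"
APRIL = "2020-04-01 00:00:00"
MAY = "2020-05-01 00:00:00"


def count(condition, low, high):
    """Count the rows matching a two-parameter range condition."""
    query = "SELECT count(*) FROM sale WHERE " + condition
    return conn.execute(query, (low, high)).fetchone()[0]


# BETWEEN includes both ends, so the boundary row lands in both months.
between_total = (
    count("sold_at BETWEEN ? AND ?", MARCH, APRIL)
    + count("sold_at BETWEEN ? AND ?", APRIL, MAY)
)

# Half-open ranges include the start and exclude the end, so they never overlap.
half_open_total = (
    count("sold_at >= ? AND sold_at < ?", MARCH, APRIL)
    + count("sold_at >= ? AND sold_at < ?", APRIL, MAY)
)

print(between_total, half_open_total)
```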
BETWEEN Is Inclusive
Integers and floats: binning, dividing into buckets...
Dates and timestamps: searching for values in a date range
-- Search for 90's babies... Wrong!
SELECT *
FROM birthdates
WHERE birthdate BETWEEN '1990-01-01' AND '2000-01-01';
Recap
Be careful when dividing integers
Dividing an integer by an integer truncates the result
Guard against "division by zero" errors
Use NULLIF and COALESCE
Be careful when aggregating nulls
Aggregate functions ignore NULL values
Be careful when comparing nulls
Use IS DISTINCT FROM to treat NULL like a value
Be careful with timestamps
Explicitly set a time zone when truncating timestamps
BETWEEN is inclusive
Use half open ranges that don’t overlap
Missed Optimization Opportunities
The ways we prevent databases from doing their thing...
Avoid Transformations on Indexed Fields
Transformation can prevent the database from using an index!
-- Bad!
SELECT * FROM sale WHERE lower(customer) = 'bill';
Execution Time: 34.914 ms
-- Good!
SELECT * FROM sale WHERE customer = 'Bill';
Execution Time: 4.309 ms
Avoid Transformations on Indexed Fields
Simple arithmetic
-- Bad!
SELECT * FROM sale WHERE id + 1 = 100;
Execution Time: 24.352 ms
-- Good!
SELECT * FROM sale WHERE id = 100 - 1;
Execution Time: 0.045 ms
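The effect on the query plan can be inspected directly. The sketch below assumes SQLite via Python's `sqlite3` as a stand-in and uses its `EXPLAIN QUERY PLAN` statement; PostgreSQL's planner and `EXPLAIN` output look different, but the principle is the same. The table contents and the helper `plan` are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY gives SQLite an indexed lookup path on id.
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany(
    "INSERT INTO sale (id, price) VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 1001)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


bad = "SELECT * FROM sale WHERE id + 1 = 100"   # transformation on the column
good = "SELECT * FROM sale WHERE id = 100 - 1"  # transformation on the constant

bad_plan = plan(bad)
good_plan = plan(good)
print(bad_plan)   # expected to describe a full table scan
print(good_plan)  # expected to describe a primary key search

# Both forms return the same row; only the access path differs.
same_result = conn.execute(bad).fetchall() == conn.execute(good).fetchall()
```

Moving the arithmetic to the constant side leaves the indexed column bare, which is what lets the database use the index.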
Avoid Transformations on Indexed Fields
Setting a time zone
-- Bad!
SELECT * FROM sale
WHERE sold_at at time zone 'America/New_York' > '2021-01-01';
Execution Time: 59.117 ms
-- Good!
SELECT * FROM sale
WHERE sold_at > '2021-01-01 America/New_York';
Execution Time: 0.036 ms
Avoid Transformations on Indexed Fields
Date arithmetic
-- Bad!
SELECT *
FROM sale
WHERE sold_at - interval '1 day' > '2021-01-01 America/New_York';
Execution Time: 48.077 ms
-- Good!
SELECT *
FROM sale
WHERE sold_at > '2021-01-01 America/New_York'::timestamptz
+ interval '1 day';
Execution Time: 0.023 ms
Avoid Transformations on Indexed Fields
String manipulation
-- Bad!
SELECT * FROM users WHERE lower(email) = 'me@hakibenita.com';
-- Good!
SELECT * FROM users WHERE email = lower('Me@HakiBenita.com');
Avoid Transformations on Indexed Fields
String concatenation
-- Bad!
SELECT * FROM users
WHERE first_name || ' ' || last_name = 'Haki Benita';
-- Good!
SELECT * FROM users
WHERE first_name = 'Haki' AND last_name = 'Benita';
UNION vs. UNION ALL
Concatenates results
SELECT 1 UNION ALL SELECT 1;
?column?
----------
1
1
Concatenates results and removes duplicates
SELECT 1 UNION SELECT 1;
?column?
----------
1
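The two results above can be reproduced in a few lines. A minimal sketch, assuming SQLite via Python's `sqlite3` as a stand-in for PostgreSQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# UNION ALL simply appends the second result to the first.
union_all = conn.execute("SELECT 1 UNION ALL SELECT 1").fetchall()

# UNION also removes duplicate rows, which is extra work for the database.
union = conn.execute("SELECT 1 UNION SELECT 1").fetchall()

print(union_all)
print(union)
```

When the inputs are known not to overlap, or duplicates are acceptable, `UNION ALL` avoids that de-duplication step.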
This is it!
Check out my blog hakibenita.com
Find me on Twitter @be_haki
Subscribe to my newsletter at hakibenita.com/subscribe
Send me an email me@hakibenita.com
Generating privacy-protected synthetic data using Secludy and Milvus
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 

Common Mistakes and Missed Optimization Opportunities in SQL

  • 1. Hey I'm Haki Benita A software developer and a technical lead. I take special interest in databases, web development, software design and performance tuning. ©hakibenita.com
  • 2. What's So Special About SQL? Available to different people inside the organization Used not only by developers Crucial for decision making
  • 3. Common Mistakes in SQL
  • 4. A Simple Sales Report Branches in NY and LA Customers are members or unknown Some sales are given a special discount Some products are free (give away)
  • 5. Sales Database id branch sold_at customer product price discount 1 NY 2020-04-01 03:15:00+00 Bill Shoes 10000 1000 2 NY 2020-04-01 04:00:00+00 ¤ Shoes 5000 0 3 LA 2020-04-01 06:15:00+00 Lily Shoes 15000 0 4 LA 2020-04-01 09:10:00+00 John Shoes 5000 2500 5 NY 2020-04-01 03:15:00+00 ¤ Shirt 1500 0 6 NY 2020-04-01 02:07:00+00 John Shirt 1850 0 7 LA 2020-03-31 09:55:00+00 Bill Shirt 1250 0 8 LA 2020-03-31 10:45:00+00 Lily Shirt 1850 100 9 NY 2020-03-31 07:45:00+00 Lily Pants 5200 0 10 LA 2020-03-31 10:45:00+00 John Pants 5200 0 11 LA 2020-04-01 07:01:00+00 David Pants 4500 0 12 LA 2020-04-02 06:01:00+00 ¤ Hat 8000 8000 13 LA 2020-04-02 06:01:00+00 Bill Give Away 0 0 14 NY 2020-03-31 17:01:00+00 ¤ Give Away 0 0 15 LA 2020-04-01 10:45:00+00 ¤ Give Away 0 0
  • 6. A Simple Sales report... What can possibly go wrong?
  • 7. What is the discount rate on Shoes SELECT price, discount, discount / price * 100 as discount_rate FROM sale WHERE product ='Shoes'; price │ discount │ discount_rate ───────┼──────────┼─────────────── 10000 │ 1000 │ 0 5000 │ 0 │ 0 15000 │ 0 │ 0 5000 │ 2500 │ 0 Is this correct?
  • 8. Be Careful When Dividing Integers Integer division truncates the result SELECT 1000 / 10000; ?column? ---------- 0 SELECT 1000 / 10000::float; ?column? ---------- 0.1 SELECT 1000 / 10000::float * 100; ?column? ---------- 10
  • 9. What is the discount rate on Shoes SELECT price, discount, discount / price::float * 100 as discount_rate FROM sale WHERE product ='Shoes'; price │ discount │ discount_rate ───────┼──────────┼─────────────── 10000 │ 1000 │ 10 5000 │ 0 │ 0 15000 │ 0 │ 0 5000 │ 2500 │ 50
  • 10. What is the average discount rate by product SELECT product, AVG(discount / price::float) * 100 as discount_rate FROM sale GROUP BY product; ERROR: division by zero
  • 11. What is the average discount rate by product SELECT id, product, price, discount FROM sale WHERE price = 0; id │ product │ price │ discount ────┼───────────┼───────┼────────── 13 │ Give Away │ 0 │ 0 14 │ Give Away │ 0 │ 0 15 │ Give Away │ 0 │ 0 Product “Give Away” price is zero and it causes division by zero
  • 12. Guard Against "division by zero" Errors Use NULLIF SELECT id, product, price, discount, discount / NULLIF(price, 0) AS discount_rate FROM sale WHERE price = 0; id │ product │ price │ discount │ discount_rate ────┼───────────┼───────┼──────────┼─────────────── 13 │ Give Away │ 0 │ 0 │ ¤ 14 │ Give Away │ 0 │ 0 │ ¤ 15 │ Give Away │ 0 │ 0 │ ¤
  • 13. Guard Against "division by zero" Errors Use COALESCE SELECT product, COALESCE(AVG(discount / NULLIF(price, 0)::float), 0) * 100 FROM sale GROUP BY product; product │ ?column? ───────────┼──────────────────── Shirt │ 1.3513513513513513 Pants │ 0 Hat │ 100 Shoes │ 15 Give Away │ 0
  • 14. How many unique users purchased each product SELECT product, COUNT(DISTINCT customer) AS customers FROM sale GROUP BY product; product │ customers ───────────┼─────────── Give Away │ 1 Hat │ 0 <--- ?? Pants │ 3 Shirt │ 3 Shoes │ 3 Is this correct?
  • 15. Be Careful Aggregating Nullable Column Aggregate functions ignore NULL values SELECT product, COUNT(*) AS cnt, COUNT(customer) AS cnt_customer FROM sale GROUP BY product; product │ cnt │ cnt_customer ───────────┼─────┼────────────── Shirt │ 4 │ 3 Pants │ 3 │ 3 Hat │ 1 │ 0 Shoes │ 4 │ 3 Give Away │ 3 │ 1 This can also be useful...
  • 16. How many members purchased each product Aggregate functions ignore NULL values SELECT product, COUNT(customer) as members, COUNT(*) - COUNT(customer) as unknown_customers FROM sale GROUP BY product; product │ members │ unknown_customers ───────────┼─────────┼─────────────────── Shirt │ 3 │ 1 Pants │ 3 │ 0 Hat │ 0 │ 1 Shoes │ 3 │ 1 Give Away │ 1 │ 2
  • 17. Write a query to find sales by any customer You need to use a parameter... SELECT * FROM sale WHERE customer = :name;
  • 18. Sales to "Bill" set name '''Bill''' SELECT id, customer FROM sale WHERE customer = :name; id │ customer ────┼────────── 1 │ Bill 7 │ Bill 13 │ Bill
  • 19. Sales to unknown customers set name null SELECT id, customer FROM sale WHERE customer = :name; id │ customer ────┼────────── (0 rows) Is this correct?
  • 20. Be Careful Comparing NULL To compare null use IS. Condition → Result: NULL = NULL → NULL; (NULL = NULL) IS NULL → t; NULL IS NULL → t
  • 21. Write a query to find sales by any customer We need to handle two cases... SELECT * FROM sale WHERE (:name IS NULL AND customer IS NULL) OR (:name IS NOT NULL AND customer = :name);
  • 22. Write a query to find sales by any customer -- A member set name '''Bill''' SELECT id, customer ... id │ customer ────┼────────── 1 │ Bill 7 │ Bill ... -- Unknown customers set name null SELECT id, customer .. id │ customer ────┼────────── 2 │ ¤ 5 │ ¤ ... There must be a better way!
  • 23. Be Careful Comparing NULL Use IS DISTINCT FROM SELECT * FROM sale WHERE customer IS NOT DISTINCT FROM :name;
  • 24. IS DISTINCT FROM Treats NULL like a literal value a │ b │ equal │ is_not_distinct_from ───┼───┼───────┼────────────────────── 1 │ 1 │ t │ t 1 │ 2 │ f │ f 1 │ ¤ │ ¤ │ f ¤ │ ¤ │ ¤ │ t
  • 25. Find the amount of sales during each month SELECT date_trunc('month', sold_at) AS month, sum(price) AS total_sales FROM sale GROUP BY month; month │ total_sales ────────────────────────┼───────────── 2020-04-01 00:00:00-04 │ 37500 2020-03-01 00:00:00-05 │ 26850
  • 26. Find the amount of sales during each month What's the problem? SET TIME ZONE 'America/New_York'; SELECT... month │ total_sales ────────────────────────┼───────────── 2020-04-01 00:00:00-04 │ 37500 2020-03-01 00:00:00-05 │ 26850 SET TIME ZONE 'America/Los_Angeles'; SELECT... month │ total_sales ────────────────────────┼───────────── 2020-04-01 00:00:00-07 │ 17500 2020-03-01 00:00:00-08 │ 46850
  • 27. Date and Times Are Hard! Time Zones Daylight Saving Database types
  • 28. Date and Times Are Hard! When truncating timestamps always set a time zone. Wrong: date_trunc('month', sold_at); Right: date_trunc('month', sold_at at time zone 'America/New_York'). Wrong: extract('hour' from sold_at); Right: extract('hour' from sold_at at time zone 'America/New_York'). Wrong: date_part('hour', sold_at); Right: date_part('hour', sold_at at time zone 'America/New_York'). Wrong: sold_at::date; Right: (sold_at at time zone 'America/New_York')::date. Wrong: '2020-03-22 11:00'; Right: '2020-03-22 11:00 America/New_York'.
  • 29. What is the busiest hour of the day? SELECT extract('hour' from sold_at at time zone 'America/New_York'), COUNT(*) FROM sale GROUP BY 1; hour_of_day | sales ────────────┼─────── 10 | 3 6 | 3 9 | 2 ... Is this correct?
  • 30. What is the busiest hour of the day? SELECT extract('hour' FROM sold_at AT TIME ZONE CASE WHEN branch = 'NY' THEN 'America/New_York' WHEN branch = 'LA' THEN 'America/Los_Angeles' END) AS hour_of_day, COUNT(*) AS sales FROM sale GROUP BY hour_of_day ORDER BY sales DESC; hour_of_day | sales ────────────┼─────── 23 | 5 3 | 4 0 | 2 ...
  • 31. How many sales were there in March? In America/New_York time SELECT count(*) FROM sale WHERE sold_at BETWEEN '2020-03-01 America/New_York' AND '2020-04-01 America/New_York'; count ─────── 9
  • 32. How many sales were there in April? In America/New_York time SELECT count(*) FROM sale WHERE sold_at BETWEEN '2020-04-01 America/New_York' AND '2020-05-01 America/New_York'; count ─────── 7 Is this correct?
  • 33. How many sales are there??? SELECT count(*) FROM sale; count ─────── 15 9 + 7 != 15
  • 34. Between is inclusive One sale is counted twice SELECT id, sold_at FROM sale WHERE sold_at = '2020-04-01 America/New_York'; id │ sold_at ────┼──────────────────────── 2 │ 2020-04-01 00:00:00-04 Sale happened exactly at midnight...
  • 35. Between is inclusive Use Half Open Ranges that don’t overlap SELECT count(*) FROM sale WHERE sold_at >= '2020-03-01 America/New_York' AND sold_at < '2020-04-01 America/New_York'; count ─────── 8 It's longer, but safer!
  • 36. Between is inclusive Integers and Float: Binning, Dividing to buckets... Dates and Timestamps: Search for values in date range -- Search for 90's babies... Wrong! SELECT * FROM birthdates WHERE birthdate BETWEEN '1990-01-01' AND '2000-01-01';
  • 37. Recap Be careful when dividing integers: dividing an integer by an integer truncates the result. Guard against "division by zero" errors: use NULLIF and COALESCE. Be careful when aggregating nulls: aggregate functions ignore NULL values. Be careful when comparing nulls: use IS DISTINCT FROM to treat NULL like a value. Be careful with timestamps: explicitly set a time zone when truncating timestamps. Between is inclusive: use half open ranges that don’t overlap.
  • 38. Missed Optimization Opportunities The ways we prevent databases from doing their thing...
  • 39. Avoid Transformations on Indexed Fields Transformation can prevent the database from using an index! -- Bad! SELECT * FROM sale WHERE lower(customer) = 'bill'; Execution Time: 34.914 ms -- Good! SELECT * FROM sale WHERE customer = 'Bill'; Execution Time: 4.309 ms
  • 40. Avoid Transformations on Indexed Fields Simple arithmetics -- Bad! SELECT * FROM sale WHERE id + 1 = 100; Execution Time: 24.352 ms -- Good! SELECT * FROM sale WHERE id = 100 - 1; Execution Time: 0.045 ms
  • 41. Avoid Transformations on Indexed Fields Setting a time zone -- Bad! SELECT * FROM sale WHERE sold_at at time zone 'America/New_York' > '2021-01-01'; Execution Time: 59.117 ms -- Good! SELECT * FROM sale WHERE sold_at > '2021-01-01 America/New_York'; Execution Time: 0.036 ms
  • 42. Avoid Transformations on Indexed Fields Date arithmetics -- Bad! SELECT * FROM sale WHERE sold_at - interval '1 day' > '2021-01-01 America/New_York'; Execution Time: 48.077 ms -- Good! SELECT * FROM sale WHERE sold_at > '2021-01-01 America/New_York'::timestamptz + interval '1 day'; Execution Time: 0.023 ms
  • 43. Avoid Transformations on Indexed Fields String manipulation -- Bad! SELECT * FROM users WHERE lower(email) = 'me@hakibenita.com'; -- Good! SELECT * FROM users WHERE email = lower('Me@HakiBenita.com');
  • 44. Avoid Transformations on Indexed Fields String concatenation -- Bad! SELECT * FROM users WHERE first_name || ' ' || last_name = 'Haki Benita' -- Good! SELECT * FROM users WHERE first_name = 'Haki' AND last_name = 'Benita'
  • 45. UNION vs. UNION ALL UNION ALL concatenates results: SELECT 1 UNION ALL SELECT 1; ?column? ---------- 1 1 UNION concatenates results and removes duplicates: SELECT 1 UNION SELECT 1; ?column? ---------- 1
  • 46. This is it! Check out my blog hakibenita.com Find me on Twitter @be_haki Subscribe to my newsletter at hakibenita.com/subscribe Send me an email me@hakibenita.com
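The integer-division and NULL pitfalls from slides 7 to 24 can be reproduced without a PostgreSQL server. The sketch below is an illustration only, not the author's setup: it assumes an in-memory SQLite database as a stand-in, loads a subset of the sample rows, and uses SQLite's `IS` operator where the slides use PostgreSQL's `IS NOT DISTINCT FROM`. Note that SQLite returns NULL on division by zero instead of raising an error, so the `NULLIF` guard is shown for portability rather than necessity.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (id INTEGER PRIMARY KEY, customer TEXT,"
    " product TEXT, price INTEGER, discount INTEGER)"
)
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Bill", "Shoes", 10000, 1000),
        (2, None, "Shoes", 5000, 0),
        (3, "Lily", "Shoes", 15000, 0),
        (4, "John", "Shoes", 5000, 2500),
        (12, None, "Hat", 8000, 8000),
        (13, "Bill", "Give Away", 0, 0),
    ],
)


def scalar(sql, params=()):
    """Return the first column of the first row."""
    return conn.execute(sql, params).fetchone()[0]


def ids(sql, params=()):
    """Return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql, params)]


# Integer / integer truncates; casting one operand to a float type fixes it.
truncated = scalar("SELECT discount / price * 100 FROM sale WHERE id = 1")
precise = scalar(
    "SELECT discount / CAST(price AS REAL) * 100 FROM sale WHERE id = 1"
)

# NULLIF turns a zero price into NULL; COALESCE maps the NULL average to 0.
give_away_rate = scalar(
    "SELECT COALESCE(AVG(discount / CAST(NULLIF(price, 0) AS REAL)), 0) * 100"
    " FROM sale WHERE product = 'Give Away'"
)

# COUNT(*) counts rows; COUNT(customer) skips rows where customer is NULL.
shoe_counts = conn.execute(
    "SELECT COUNT(*), COUNT(customer) FROM sale WHERE product = 'Shoes'"
).fetchone()

# '=' never matches NULL; a null-safe comparison does.
unknown_by_equals = ids("SELECT id FROM sale WHERE customer = ? ORDER BY id", (None,))
unknown_by_is = ids("SELECT id FROM sale WHERE customer IS ? ORDER BY id", (None,))
bill_by_is = ids("SELECT id FROM sale WHERE customer IS ? ORDER BY id", ("Bill",))

print(truncated, precise, give_away_rate, shoe_counts)
print(unknown_by_equals, unknown_by_is, bill_by_is)
```

The same parameterized query serves both members and unknown customers once the comparison is null-safe, which is the point of slide 23.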
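Slides 25 to 28 show that the month a sale falls into depends on the time zone used to truncate the timestamp. A minimal Python sketch of the same effect follows. It uses three of the sample timestamps and, as a simplifying assumption, fixed UTC offsets that were in effect on those dates (UTC-4 for New York, UTC-7 for Los Angeles) rather than a full time zone database.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets valid for late March / early April 2020 only.
NEW_YORK = timezone(timedelta(hours=-4))
LOS_ANGELES = timezone(timedelta(hours=-7))

# sold_at values of sales 1, 3 and 7 from the sample table, stored in UTC.
SOLD_AT_UTC = [
    datetime(2020, 4, 1, 3, 15, tzinfo=timezone.utc),
    datetime(2020, 4, 1, 6, 15, tzinfo=timezone.utc),
    datetime(2020, 3, 31, 9, 55, tzinfo=timezone.utc),
]


def sales_per_month(timestamps, tz):
    """Count sales per calendar month as seen from the given time zone."""
    return Counter(ts.astimezone(tz).strftime("%Y-%m") for ts in timestamps)


utc_months = sales_per_month(SOLD_AT_UTC, timezone.utc)
ny_months = sales_per_month(SOLD_AT_UTC, NEW_YORK)
la_months = sales_per_month(SOLD_AT_UTC, LOS_ANGELES)

print(dict(utc_months), dict(ny_months), dict(la_months))
```

The three zones put the same three sales into different months, so a report that truncates without naming a zone silently depends on the session setting.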
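The double counting from slides 31 to 35 can be checked against the sample data. This sketch again assumes SQLite as a stand-in and stores `sold_at` as UTC text, so the New York month boundaries are written as their UTC equivalents (an assumption of this example, since SQLite has no time zone aware timestamp type).

```python
import sqlite3

# sold_at of sales 1..15 from the sample table, in UTC.
SOLD_AT_UTC = [
    "2020-04-01 03:15:00", "2020-04-01 04:00:00", "2020-04-01 06:15:00",
    "2020-04-01 09:10:00", "2020-04-01 03:15:00", "2020-04-01 02:07:00",
    "2020-03-31 09:55:00", "2020-03-31 10:45:00", "2020-03-31 07:45:00",
    "2020-03-31 10:45:00", "2020-04-01 07:01:00", "2020-04-02 06:01:00",
    "2020-04-02 06:01:00", "2020-03-31 17:01:00", "2020-04-01 10:45:00",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, sold_at TEXT)")
conn.executemany(
    "INSERT INTO sale (sold_at) VALUES (?)", [(ts,) for ts in SOLD_AT_UTC]
)

# Midnight in New York at the start of each month, expressed in UTC.
MARCH = "2020-03-01 05:00:00"
APRIL = "2020-04-01 04:00:00"
MAY = "2020-05-01 04:00:00"


def count(condition, low, high):
    """Count sales matching a two-parameter range condition."""
    sql = f"SELECT COUNT(*) FROM sale WHERE {condition}"
    return conn.execute(sql, (low, high)).fetchone()[0]


BETWEEN = "sold_at BETWEEN ? AND ?"             # inclusive on both ends
HALF_OPEN = "sold_at >= ? AND sold_at < ?"      # excludes the upper bound

between_total = count(BETWEEN, MARCH, APRIL) + count(BETWEEN, APRIL, MAY)
half_open_total = count(HALF_OPEN, MARCH, APRIL) + count(HALF_OPEN, APRIL, MAY)

print(between_total, half_open_total)
```

With `BETWEEN`, sale 2 (exactly midnight New York time on April 1) lands in both months; the half-open ranges add up to the true total of 15.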
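The "avoid transformations on indexed fields" advice from slides 39 to 44 can be observed in a query plan. The sketch below is a rough illustration using SQLite's `EXPLAIN QUERY PLAN` in place of PostgreSQL's `EXPLAIN`; the index name `sale_customer_ix` and the trimmed-down table are hypothetical. The exact plan text varies between SQLite versions, so the code only looks at whether the plan starts with `SCAN` (full scan) or `SEARCH` (index or key lookup).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (id INTEGER PRIMARY KEY, customer TEXT, price INTEGER)"
)
# Hypothetical index name, created for this illustration only.
conn.execute("CREATE INDEX sale_customer_ix ON sale (customer)")


def plan(sql):
    """Return the first step of SQLite's query plan as text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable detail


# Transformation on the column: the planner cannot use the key or index.
bad_arithmetic = plan("SELECT * FROM sale WHERE id + 1 = 100")
bad_lower = plan("SELECT * FROM sale WHERE lower(customer) = 'bill'")

# Transformation moved to the constant side: lookups become possible.
good_arithmetic = plan("SELECT * FROM sale WHERE id = 100 - 1")
good_plain = plan("SELECT * FROM sale WHERE customer = 'Bill'")

for label, step in [
    ("id + 1 = 100", bad_arithmetic),
    ("id = 100 - 1", good_arithmetic),
    ("lower(customer) = 'bill'", bad_lower),
    ("customer = 'Bill'", good_plain),
]:
    print(f"{label:28} {step}")
```

Other engines report plans differently, but the principle is the same: keep the indexed column bare on one side of the comparison and apply any transformation to the constant.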