Three steps to Untangle Data Traffic Jams
About me
• Oscar Westra van Holthe – Kind
• Software developer with bol.com since 2012
• You may know me from:
• Connecting retailer via SDD
• Retailer invoicing
• Topspin
• Measurements 2.0
Prerequisites
• Know your data
• Know what you do with it
• Basic understanding of SQL
The problem: data traffic jam
Symptoms:
• Your queries are slow
• Your DB connection times out on the query
• …
Relational Databases
• For OLTP, nothing beats a relational database
• PostgreSQL query planner does an awesome job
• The root cause is always that the DB is doing too much
Tool: the EXPLAIN command
• Tells you how the database will execute your query
• Tells you the associated costs
cost=42.78..812.82 rows=16532 width=32
• 42.78: estimated start-up cost (the time before the output can begin)
• 812.82: estimated total cost, assuming the plan node is run to completion, i.e. there's no limit clause or similar
• rows=16532: estimated number of rows output by this plan node
• width=32: estimated average width of rows output by this plan node (in bytes)
• Costs have arbitrary scale, but lower is better
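To make the four numbers concrete, here is a minimal Python sketch that splits such a cost annotation into named fields. The names `PlanEstimate` and `parse_estimate` are made up for this illustration; they are not part of PostgreSQL or of any client library.

```python
import re
from typing import NamedTuple


class PlanEstimate(NamedTuple):
    """Planner estimates attached to one plan node (hypothetical helper type)."""
    startup_cost: float  # cost before the first row can be returned
    total_cost: float    # cost if the node is run to completion
    rows: int            # estimated number of rows output
    width: int           # estimated average row width in bytes


# Matches the "cost=A..B rows=N width=W" annotation printed by EXPLAIN.
_ESTIMATE = re.compile(
    r"cost=(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?) rows=(\d+) width=(\d+)"
)


def parse_estimate(line: str) -> PlanEstimate:
    """Extract the planner estimates from a single line of EXPLAIN output."""
    match = _ESTIMATE.search(line)
    if match is None:
        raise ValueError(f"no cost annotation found in: {line!r}")
    startup, total, rows, width = match.groups()
    return PlanEstimate(float(startup), float(total), int(rows), int(width))


estimate = parse_estimate("cost=42.78..812.82 rows=16532 width=32")
print(estimate)
```

Comparing `total_cost` between two variants of the same query is only meaningful on the same database, since the scale is arbitrary.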
Execution plan: the menu
• Scans:
• Sequential (full table access)
• Index (range)
• Index (range) only
• Bitmap (reads index first, then does lookups)
• Joins:
• Hash (builds a hash table from the smaller table, then probes it with rows from the larger)
• Nested loop (reads one table, usually the smaller, and looks up each of its rows in the other)
• Merge (zips two large tables after sorting them on join key)
• Sorting
• Mergesort (disk), Quicksort (memory), Heapsort (memory, limit), None
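The join strategies are easier to tell apart in code. Below is a simplified Python sketch of a hash join and a merge join over lists of dictionaries. It only illustrates the idea and is not how PostgreSQL implements them; the table contents and the column names `d_name` and `o_id` are invented for the example.

```python
from typing import Dict, Iterator, List

Row = Dict[str, object]


def hash_join(small: List[Row], large: List[Row], key: str) -> Iterator[Row]:
    """Build a hash table on the smaller input, then probe it with the larger."""
    buckets: Dict[object, List[Row]] = {}
    for row in small:
        buckets.setdefault(row[key], []).append(row)
    for row in large:
        for match in buckets.get(row[key], []):
            yield {**match, **row}


def merge_join(left: List[Row], right: List[Row], key: str) -> Iterator[Row]:
    """Sort both inputs on the join key, then walk them in step ("zip")."""
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with that key with the whole run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    yield {**left[i], **r}
                i += 1
            j = j_end


districts = [{"d_id": 1, "d_name": "north"}, {"d_id": 2, "d_name": "south"}]
orders = [
    {"o_id": 10, "d_id": 2},
    {"o_id": 11, "d_id": 1},
    {"o_id": 12, "d_id": 2},
    {"o_id": 13, "d_id": 3},  # no matching district, dropped by an inner join
]

hashed = list(hash_join(districts, orders, "d_id"))
merged = list(merge_join(districts, orders, "d_id"))
```

Both functions return the same rows. They differ in the work done up front: the hash join needs memory for the smaller input, while the merge join needs both inputs sorted, which is cheap when an index already delivers them in order.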
What if EXPLAIN tells you…
Hash Join (cost=22896.89..54208.53 rows=330801 width=1239)
  Hash Cond: (order_line.ol_o_id = oorder.o_id)
  -> Nested Loop (cost=8853.68..27149.42 rows=32734 width=542)
        -> Seq Scan on warehouse (cost=0.00..1.01 rows=1 width=85)
              Filter: (w_id = 1)
        -> Merge Join (cost=8853.68..26821.07 rows=32734 width=457)
              Merge Cond: (order_line.ol_i_id = item.i_id)
              -> Merge Join (cost=8852.66..22503.03 rows=32734 width=385)
                    Merge Cond: (stock.s_i_id = order_line.ol_i_id)
                    -> Index Scan using pk_stock on stock (cost=0.00..12910.70 rows=100000 width=315)
                          Index Cond: (s_w_id = 1)
                    -> Materialize (cost=8852.63..9261.81 rows=32734 width=70)
                          -> Sort (cost=8852.63..8934.47 rows=32734 width=70)
                                Sort Key: order_line.ol_i_id
                                -> Bitmap Heap Scan on order_line (cost=843.82..5053.83 rows=32734 width=70)
                                      Recheck Cond: ((ol_w_id = 1) AND (ol_d_id = 1))
                                      -> Bitmap Index Scan on pk_order_line (cost=0.00..835.64 rows=32734 width=0)
                                            Index Cond: ((ol_w_id = 1) AND (ol_d_id = 1))
              -> Index Scan using pk_item on item (cost=0.00..3659.26 rows=100000 width=72)
  -> Hash (cost=11040.12..11040.12 rows=29767 width=697)
        -> Hash Join (cost=3743.15..11040.12 rows=29767 width=697)
              Hash Cond: (oorder.o_d_id = district.d_id)
              -> Merge Join (cost=3741.90..10629.58 rows=29767 width=606)
                    Merge Cond: ((customer.c_d_id = oorder.o_d_id) AND (customer.c_id = oorder.o_c_id))
                    -> Index Scan using pk_customer on customer (cost=0.00..6215.00 rows=30000 width=564)
                          Index Cond: (c_w_id = 1)
                    -> Materialize (cost=3741.90..4116.90 rows=30000 width=42)
                          -> Sort (cost=3741.90..3816.90 rows=30000 width=42)
                                Sort Key: oorder.o_d_id, oorder.o_c_id
                                -> Seq Scan on oorder (cost=0.00..636.00 rows=30000 width=42)
                                      Filter: (o_w_id = 1)
              -> Hash (cost=1.12..1.12 rows=10 width=91)
                    -> Seq Scan on district (cost=0.00..1.12 rows=10 width=91)
                          Filter: (d_w_id = 1)
Three steps to improve
1. Know your data
2. Know your use cases
3. Tune the query
Know your data
• Design for reading
• Identify immutable data
Know your use cases
• Filter on a minimal number of tables
• Denormalize immutable data to reduce joins
• Make data immutable if appropriate (and change your use cases accordingly)
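As a rough illustration of denormalizing immutable data, the sketch below copies an item name onto each order line at write time, so that reading order lines needs no join. It uses Python's built-in `sqlite3` module purely so that it runs self-contained; the slides target PostgreSQL, where the SQL would look much the same. The column `ol_i_name` and the assumption that an item name never changes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (
        i_id   INTEGER PRIMARY KEY,
        i_name TEXT NOT NULL          -- assumed immutable once written
    );
    CREATE TABLE order_line (
        ol_id     INTEGER PRIMARY KEY,
        ol_i_id   INTEGER NOT NULL REFERENCES item (i_id),
        ol_i_name TEXT NOT NULL       -- hypothetical denormalized copy of item.i_name
    );
    """
)
conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, "book"), (2, "lamp")])

# Copy the immutable name onto the order line when the line is written.
conn.executemany(
    """
    INSERT INTO order_line (ol_id, ol_i_id, ol_i_name)
    SELECT ?, i_id, i_name FROM item WHERE i_id = ?
    """,
    [(100, 2), (101, 1), (102, 2)],
)

# Normalized read: needs a join.
joined = conn.execute(
    """
    SELECT ol.ol_id, i.i_name
    FROM order_line ol JOIN item i ON i.i_id = ol.ol_i_id
    ORDER BY ol.ol_id
    """
).fetchall()

# Denormalized read: one table, same answer.
single_table = conn.execute(
    "SELECT ol_id, ol_i_name FROM order_line ORDER BY ol_id"
).fetchall()
```

The copy is only safe because the source value is treated as immutable. If it could change, every copy would have to be updated as well, which is the cost that normalization avoids.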
Tune your queries
• Create an index for every query/table pair
• Order the index columns by type of use:
• Filters (where + join) first
• Then group by
• Last order by (if appropriate)
• Notes:
• Smaller indices perform better (sometimes you actually need near-duplicate indices)
• Too many indices degrade (write) performance
→ limit the number of queries if you can
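The column-ordering rule can be written down as a small helper. The Python sketch below builds a `CREATE INDEX` statement with filter columns first, then group-by columns, then order-by columns. The function names and the index naming scheme are invented for this example, and identifiers are assumed to be trusted (no quoting is done).

```python
from typing import List, Sequence


def index_columns(
    filters: Sequence[str],
    group_by: Sequence[str] = (),
    order_by: Sequence[str] = (),
) -> List[str]:
    """Order columns as: filters (where + join), then group by, then order by."""
    ordered: List[str] = []
    for column in (*filters, *group_by, *order_by):
        if column not in ordered:  # a column is listed once, at its first use
            ordered.append(column)
    return ordered


def create_index_sql(
    table: str,
    filters: Sequence[str],
    group_by: Sequence[str] = (),
    order_by: Sequence[str] = (),
) -> str:
    """Build a CREATE INDEX statement; the idx_<table>_<cols> name is arbitrary."""
    columns = index_columns(filters, group_by, order_by)
    name = f"idx_{table}_{'_'.join(columns)}"
    return f"CREATE INDEX {name} ON {table} ({', '.join(columns)});"


# Filter and sort columns taken from the order_line part of the example plan.
statement = create_index_sql(
    "order_line",
    filters=["ol_w_id", "ol_d_id"],
    order_by=["ol_i_id"],
)
print(statement)
```

For the `order_line` access in the example plan, an index like this might let the planner skip the separate Sort step on `ol_i_id`; whether it does should be checked with EXPLAIN on real data.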
Takeaway
• The best way to optimize queries is to do less
• Know your data & use cases
Resources
• PostgreSQL documentation on “Using EXPLAIN”
https://www.postgresql.org/docs/current/static/using-explain.html
• In-depth explanation of a single execution plan:
https://robots.thoughtbot.com/reading-an-explain-analyze-query-plan
Thanks
till next bol.com
Oscar Westra van Holthe - Kind
owestra@bol.com
What if that’s not enough?
• You’ve optimized your use cases
• You’ve optimized your data for read performance
• You’ve optimized your queries
• And it’s still not good enough…
• Then it’s time for a 70’s-era mainframe big data solution
→ but that fundamentally changes your use cases!
• Not querying when you need it, but:
• Batching / asynchronous → run query ahead of using its results
• Streaming → query continuously
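The difference between the two approaches can be sketched in a few lines of Python. This is only a toy model with invented names (`precompute_totals`, `RunningTotals`) and made-up events; a real setup would use a scheduler or a stream processor rather than in-memory dictionaries.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[str, int]  # (key, amount), e.g. a warehouse id and a quantity


def precompute_totals(events: Iterable[Event]) -> Dict[str, int]:
    """Batching: run the expensive aggregation ahead of time and store the result."""
    totals: Dict[str, int] = defaultdict(int)
    for key, amount in events:
        totals[key] += amount
    return dict(totals)


class RunningTotals:
    """Streaming: keep the answer up to date as each event arrives."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = defaultdict(int)

    def on_event(self, key: str, amount: int) -> None:
        self._totals[key] += amount

    def total(self, key: str) -> int:
        # .get avoids creating an entry for keys that were never seen
        return self._totals.get(key, 0)


events = [("w1", 5), ("w2", 3), ("w1", 2)]

# Batch: computed once, before anyone asks.
batch_result = precompute_totals(events)

# Stream: updated continuously, readable at any moment.
stream = RunningTotals()
for key, amount in events:
    stream.on_event(key, amount)
```

In both cases the read is a cheap lookup. The trade-off is freshness: the batch result is as old as the last run, while the streaming result reflects every event processed so far.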
