Round Trip Client-Side COPY for High Volume Postgres Inserts
DANIEL GERLANC
PRESIDENT & FOUNDER
Enplus Advisors, Inc.
Round Trip Client-Side COPY (RTC COPY)
COPY to buffer on client, then COPY into destination table
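A minimal sketch of the two COPY legs described above, assuming psycopg2-style connections whose cursors provide `copy_expert`. The function name `rtc_copy` and its parameters are hypothetical, not taken from the talk, and the SQL fragments are interpolated unescaped, so they must come from trusted code.

```python
import io


def rtc_copy(src_conn, dst_conn, select_sql, dest_table):
    """Round-trip rows through a client-side buffer using two COPY commands.

    Assumption: both arguments are psycopg2 connections (their cursors
    provide copy_expert). select_sql and dest_table are trusted strings.
    """
    buf = io.StringIO()

    # Leg 1: server -> client. COPY the query result into the buffer.
    with src_conn.cursor() as cur:
        cur.copy_expert(f"COPY ({select_sql}) TO STDOUT", buf)

    buffered_chars = buf.tell()  # amount of data held on the client
    buf.seek(0)  # rewind so the second COPY reads from the start

    # Leg 2: client -> server. COPY the buffered rows into the destination.
    with dst_conn.cursor() as cur:
        cur.copy_expert(f"COPY {dest_table} FROM STDIN", buf)

    dst_conn.commit()
    return buffered_chars
```

For a table-to-table load on one server, the same connection could be passed as both `src_conn` and `dst_conn`. The whole result set sits in client memory between the two legs, which is the cost of this approach.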
Problem
Insert large number of records from SELECTed tables into a logged destination table
Don’t control the schema
If possible, all or nothing
Problem
Table(s)-to-Table inserts were not completing!
Solution:
1. INSERT in smaller batches
2. Round Trip Client-Side COPY (RTC Copy)
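The first option above, inserting in smaller batches, could look roughly like the sketch below. It assumes the source rows carry a dense integer key; `src_table`, `dest_table`, and the `id` column are hypothetical names used only for illustration.

```python
def id_batches(min_id, max_id, batch_size):
    """Yield inclusive (low, high) id ranges that cover min_id..max_id."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def batched_insert_statements(min_id, max_id, batch_size):
    """Build one INSERT ... SELECT statement per id range.

    src_table, dest_table and id are placeholder names (assumptions).
    """
    template = (
        "INSERT INTO dest_table SELECT * FROM src_table "
        "WHERE id BETWEEN {low} AND {high}"
    )
    return [
        template.format(low=low, high=high)
        for low, high in id_batches(min_id, max_id, batch_size)
    ]
```

Committing each batch separately gives up the "all or nothing" goal from the problem statement, so the batches would need to run inside one transaction to keep it.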
How’d you come up with this?
To avoid putting load on a production server, I ran some analyses on my local Postgres instance
COPYed from my local Postgres instance to disk, then COPYed to the remote Postgres
Noticed that I never had problems inserting into the production database this way
Benchmark
Always 1..10K2
Insert 1..n11
Start With = count(n_users) * 5
Insert: count(n_users) * 5
T2T INSERT or RTC COPY
Benchmark
1. Drop and recreate all tables except benchmarks result table
2. Populate users
3. Populate categories
4. Populate user_stats with random data for all users, 5 categories
5. Populate user_stats_staging with random data for all users and additional 5 categories
6. Run VACUUM ANALYZE
7. Copy records from user_stats_staging to user_stats using INSERT or COPY
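Step 7 is the part that gets timed. A rough sketch of the table-to-table INSERT variant with a small timing helper follows; it assumes a psycopg2-style connection and that `user_stats` and `user_stats_staging` have the same columns in the same order. The helper names are hypothetical and not taken from the benchmark repository.

```python
import time

# Table names come from the benchmark steps; SELECT * assumes that both
# tables share the same column layout.
T2T_INSERT_SQL = "INSERT INTO user_stats SELECT * FROM user_stats_staging"


def time_once(action):
    """Run action() once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def run_t2t_insert(conn):
    """Table-to-table INSERT executed entirely on the server."""
    with conn.cursor() as cur:
        cur.execute(T2T_INSERT_SQL)
    conn.commit()
```

A run would then be timed with something like `time_once(lambda: run_t2t_insert(conn))`, and the COPY variant would be timed the same way.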
Benchmark
e.g. Number of Users = 1K
Destination table: Logged or Unlogged
Method: T2T INSERT or RTC COPY
30 Repetitions for each condition
n_users = (1K, 10K, 50K, 100K, 500K)
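The design above can be enumerated as a full cross of user counts, table types, and methods. The sketch below is only an illustration of that grid; the constant and key names are made up here, and the factor of 5 is the number of categories per user from the benchmark steps.

```python
from itertools import product

N_USERS = (1_000, 10_000, 50_000, 100_000, 500_000)
TABLE_TYPES = ("logged", "unlogged")
METHODS = ("T2T INSERT", "RTC COPY")
REPETITIONS = 30
CATEGORIES_PER_LOAD = 5  # 5 existing categories, 5 additional ones inserted


def benchmark_runs():
    """Yield one dict per timed run across every condition."""
    for n_users, table_type, method in product(N_USERS, TABLE_TYPES, METHODS):
        rows = n_users * CATEGORIES_PER_LOAD
        for repetition in range(1, REPETITIONS + 1):
            yield {
                "n_users": n_users,
                "table_type": table_type,
                "method": method,
                "existing_rows": rows,
                "rows_to_insert": rows,
                "repetition": repetition,
            }
```

That gives 5 x 2 x 2 = 20 conditions and 600 timed runs in total.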
1K Users, 5K Existing, 5K Insert
[Figure: Box Plot of 30 Repetitions per Condition]
[Figure: BCa Bootstrap Confidence Intervals for Mean Difference, 2,000 Replications*]
*Calculated with R 3.4.2 and bootES 1.2
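To give a feel for what a bootstrap interval for a mean difference involves, here is a simplified sketch using only the Python standard library. It computes a plain percentile interval, not the bias-corrected and accelerated (BCa) interval that the slides obtained from R and bootES, so its bounds would generally differ somewhat from the reported ones. The function name and defaults are assumptions.

```python
import random
from statistics import mean


def bootstrap_mean_diff_ci(a, b, replications=2000, confidence=0.95, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b).

    Simplification: percentile method, not BCa. Each sample is resampled
    with replacement, independently of the other.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(replications):
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()

    # Pick the order statistics that bracket the central `confidence` mass.
    alpha = 1.0 - confidence
    last = replications - 1
    lower_idx = min(max(round(alpha / 2 * replications), 0), last)
    upper_idx = min(max(round((1 - alpha / 2) * replications) - 1, 0), last)
    return mean(a) - mean(b), diffs[lower_idx], diffs[upper_idx]
```

Here `a` and `b` would be the 30 timings for two conditions. An interval that excludes zero is what the later slides treat as a statistically significant difference.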
10K Users, 50K Existing, 50K Insert
[Figure: Box Plot of 30 Repetitions per Condition]
[Figure: BCa Bootstrap Confidence Intervals for Mean Difference, 2,000 Replications]
50K Users, 250K Existing, 250K Insert
[Figure: Box Plot of 30 Repetitions per Condition]
[Figure: BCa Bootstrap Confidence Intervals for Mean Difference, 2,000 Replications]
100K Users, 500K Existing, 500K Insert
[Figure: Box Plot of 30 Repetitions per Condition]
[Figure: BCa Bootstrap Confidence Intervals for Mean Difference, 2,000 Replications]
500K Users, 2.5M Existing, 2.5M Insert
[Figure: Box Plot of 30 Repetitions per Condition]
[Figure: BCa Bootstrap Confidence Intervals for Mean Difference, 2,000 Replications]
Is the difference meaningful?
[Figure: Mean and BCa Confidence Intervals, 2,000 Replications]
Statistically Significant Results
5K Records: No statistically significant difference
50K Records: Logged COPY faster
100K Records: Unlogged COPY faster
500K Records: Unlogged INSERT faster
Other differences not statistically significant.
Questions?
CONTACT
DANIEL GERLANC
ENPLUS ADVISORS, INC.
dgerlanc@enplusadvisors.com
https://github.com/dgerlanc/pyrtcbench
