Slide 1

Bigger data with PostgreSQL 9
Data warehousing in the 21st century.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

© by Numius nv

Open systems, Smarter people
Slide 2

The presenter

• Bert Desmet
• Consultant @ Deloitte
• System Engineer / DBA for deloitteanalytics.eu
• 'devop'?

Slide 3

Agenda

• Introduction
• Release the elephants!
• Impacting factors
• Divide et impera
• Basic configuration
• Passing the speed limits
• Keep your database fit

Slide 4

Big data?

• 44x data growth per year!
• 80% of data is unstructured
• About 35.2 zettabytes by 2020
• The volume will grow by a whopping 650% in the next 5 years
• 80% of organisations will use cloud analytics
• By 2014, 80% of enterprises will want a SaaS-based BI system

Slide 5

Know your limits
• DB2
• More load
• Scaling
• Speed
• Data size
• Pricing

Slide 6

Release the elephants!

Slide 7

PostgreSQL 9
• Good for big databases
• Easy maintenance
• Scales!
• Very fast
• Extensible

Impacting factors
Slide 9

Highly impacting operations

• Data load
  • In bulk (ETL)
  • Row by row: up to 100k rows / minute

• Data fetch (reporting)
  • We do like joins. The more the better.

Slide 10

Extra problems

• A lot of I/O
• A lot of CPU power (index creation)
• A lot of locks

Slide 11

The solution?

• Use at least 2 servers
• Set up binary replication
• Put a lot of RAM in your servers.

Slide 12

Dataflow

Slide 13

Divide et impera

Slide 14

Replication with postgres

• 8.3 Warm Standby
• 9.0 Async. Binary Replication
• 9.1 Synchronous Replication
• 9.2 Cascading Replication
• 9.3 More improvements towards failover / switching masters
• 9.4 Multimaster Binary Replication?

Slide 15

Configure replication

• wal_level = hot_standby
• checkpoint_segments >= 32
• checkpoint_completion_target >= 0.8
• hot_standby = on
• hot_standby_feedback = on
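
One way to confirm that the settings above took effect is to read them back from the running servers. This is a minimal sketch, assuming a 9.1 to 9.4 server (checkpoint_segments no longer exists in later releases); it only inspects values and changes nothing.

```sql
-- On the master: show the replication-related settings from this slide.
SELECT name, setting
FROM pg_settings
WHERE name IN ('wal_level',
               'checkpoint_segments',
               'checkpoint_completion_target',
               'hot_standby',
               'hot_standby_feedback')
ORDER BY name;

-- On the standby: returns true while the server is replaying WAL.
SELECT pg_is_in_recovery();
```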

Slide 16

Slide 17

Keep it simple, stupid

• 2ndQuadrant is pretty awesome
• Barman for backups
• repmgr for replication management

Slide 18

Basic configuration

Slide 19

Raise those memory limits!

• shared_buffers = 1/8 to ¼ of RAM
• work_mem = 128MB to 1GB
• maintenance_work_mem = 512MB to 1GB
• temp_buffers = 128MB to 1GB
• effective_cache_size = ¾ of RAM
• wal_buffers = 32MB
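
Some of these limits can be tried out per session before they go into postgresql.conf. The sketch below assumes you are connected as the user running a heavy report or index build; the values are illustrative picks from the ranges above, not recommendations. shared_buffers and wal_buffers cannot be changed this way, they need a server restart.

```sql
-- Raise the per-operation sort/hash memory for this session only.
SET work_mem = '256MB';

-- Give index builds and VACUUM more memory for this session only.
SET maintenance_work_mem = '1GB';

-- Inspect the values currently in effect.
SHOW shared_buffers;
SHOW work_mem;
SHOW maintenance_work_mem;
SHOW effective_cache_size;

-- Return to the server defaults afterwards.
RESET work_mem;
RESET maintenance_work_mem;
```

Keep in mind that work_mem applies to each sort or hash operation, so a query with many joins, run from many sessions at once, can use a multiple of the configured value.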

Slide 20

Tune the planner for correct planning

• random_page_cost = 3
• cpu_tuple_cost = 0.1
• constraint_exclusion = on
• from_collapse_limit >= 12
• join_collapse_limit >= 12
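
The planner settings above can be tested in a single session before changing them server-wide. A minimal sketch follows; the tables fact_sales and dim_customer and their columns are hypothetical and only serve to give EXPLAIN something to plan.

```sql
-- Session-local planner settings; nothing here touches postgresql.conf.
SET random_page_cost = 3;
SET cpu_tuple_cost = 0.1;
SET constraint_exclusion = on;
SET from_collapse_limit = 12;
SET join_collapse_limit = 12;

-- Run this before and after the SET commands and compare the plans.
-- fact_sales and dim_customer are hypothetical tables.
EXPLAIN
SELECT c.country, sum(f.amount)
FROM fact_sales f
JOIN dim_customer c ON c.customer_id = f.customer_id
GROUP BY c.country;
```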

Slide 21

Passing the speed limits

Slide 22

Use partitions

• Think about the partition key!
• Trigger based for row-by-row inserts
• Rule based for bulk inserts
• Make sure you add constraints
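
As an illustration of trigger-based partitioning with constraints, here is a sketch using table inheritance, the mechanism available in PostgreSQL 9. The table sales, its columns, and the monthly partition key are assumptions made for the example.

```sql
-- Parent table; it stays empty, the children hold the rows.
CREATE TABLE sales (
    sale_date date    NOT NULL,
    amount    numeric NOT NULL
);

-- One child per month. The CHECK constraint is what lets the planner
-- skip this partition when constraint_exclusion is enabled.
CREATE TABLE sales_2013_01 (
    CHECK (sale_date >= DATE '2013-01-01' AND sale_date < DATE '2013-02-01')
) INHERITS (sales);

-- Route each inserted row to the matching child table.
CREATE OR REPLACE FUNCTION sales_insert_router() RETURNS trigger AS $$
BEGIN
    IF NEW.sale_date >= DATE '2013-01-01'
       AND NEW.sale_date < DATE '2013-02-01' THEN
        INSERT INTO sales_2013_01 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for sale_date %', NEW.sale_date;
    END IF;
    RETURN NULL;  -- the row must not also land in the parent table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sales_insert_trigger
    BEFORE INSERT ON sales
    FOR EACH ROW EXECUTE PROCEDURE sales_insert_router();
```

Each new month needs a new child table and a new branch in the function, which is usually scripted.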

Slide 23

Use indexes

• Learn to read query explains
• Use http://explain.depesz.com/
• Don’t over-index
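
A query explain can be produced as sketched below; the text output is what you paste into explain.depesz.com. The table sales and its columns are hypothetical.

```sql
-- ANALYZE executes the query and reports real row counts and timings;
-- BUFFERS adds buffer hit/read counts (available since 9.0).
EXPLAIN (ANALYZE, BUFFERS)
SELECT sale_date, sum(amount)
FROM sales
WHERE sale_date >= DATE '2013-01-01'
  AND sale_date <  DATE '2013-02-01'
GROUP BY sale_date;
```

Because ANALYZE really runs the statement, wrap data-modifying statements in a transaction and roll it back.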

Slide 24

Other sane things to do

• Use unique indexes
  • Created automatically when defining a primary key

• Use clustered indexes
  • And cluster those tables regularly
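
Clustering a table could look like the sketch below. The table sales and the index name are hypothetical. PostgreSQL does not keep the physical order up to date after CLUSTER, which is why the command has to be repeated.

```sql
-- Index that defines the physical order of the table.
CREATE INDEX sales_sale_date_idx ON sales (sale_date);

-- Rewrite the table in index order. This takes an ACCESS EXCLUSIVE
-- lock, so schedule it outside loading and reporting windows.
CLUSTER sales USING sales_sale_date_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE sales;

-- Later runs can reuse the index remembered from the first CLUSTER.
CLUSTER sales;
```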

Slide 25

Use partial indexes

• Not every database has them; PostgreSQL does.
• Really useful on big tables
• Disadvantage: no ‘moving’ indexes, e.g. an index for the current day.
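
A partial index is an ordinary index with a WHERE clause. The sketch below uses a hypothetical table sales with columns sale_date and amount, and also shows why a "moving" index is not possible.

```sql
-- Index only the rows the reports actually filter on.
CREATE INDEX sales_large_amount_idx
    ON sales (sale_date)
    WHERE amount > 1000;

-- The planner can consider the index when the query's WHERE clause
-- implies the index predicate (amount > 5000 implies amount > 1000).
EXPLAIN
SELECT sale_date, amount
FROM sales
WHERE amount > 5000
  AND sale_date = DATE '2013-06-15';

-- Rejected by PostgreSQL: functions in an index predicate must be
-- IMMUTABLE, and current_date is not.
-- CREATE INDEX sales_today_idx ON sales (sale_date)
--     WHERE sale_date = current_date;
```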

Keep your database fit
Slide 27

Vacuum

• Disable autovacuum for data warehouses
• Vacuum once a day
• Check regularly that the vacuums actually run!
• Prevents data loss (transaction ID wraparound)
• Prevents the database from growing out of control, size-wise
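
Whether the daily vacuums really run can be checked from the statistics views. This is a sketch of such a check, not a complete monitoring setup.

```sql
-- Tables that were never vacuumed manually sort first.
SELECT schemaname, relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_vacuum NULLS FIRST;

-- Transaction ID age per database; a steadily growing value means
-- vacuum is not keeping up and wraparound protection will kick in.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```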

Slide 28

Analyze

• Analyze once a day
  • Together with vacuum
  • VACUUM ANALYZE <schema>.<table>;

• default_statistics_target >= 300
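
Put together, the daily job for one table could look like this sketch. The table public.sales and the column sale_date are hypothetical; the per-column statistics target is an alternative to raising default_statistics_target for the whole server.

```sql
-- Reclaim dead rows and refresh planner statistics in one pass.
VACUUM ANALYZE public.sales;

-- Collect more detailed statistics for one frequently filtered column.
ALTER TABLE public.sales ALTER COLUMN sale_date SET STATISTICS 300;

-- The new target only takes effect at the next ANALYZE.
ANALYZE public.sales;
```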

Slide 29

Check for bloat!

• Free space in tables
• Indexes are no longer optimal
• Use Nagios with check_postgres.pl
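
For a quick manual look between monitoring runs, the statistics views give a rough indication. The query below is only a sketch: the dead-tuple count is a proxy, not the bloat estimate that check_postgres.pl calculates.

```sql
-- Tables with the most dead rows, together with their total size
-- (table, indexes and TOAST data).
SELECT schemaname, relname,
       n_live_tup, n_dead_tup,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
```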

Slide 30

Prevent bloat

• VACUUM FULL
  • Offline!
  • Only when a primary key is not available

• Repack
  • Online!
  • Orders the tables (clustered index)
  • Needs a primary key on the table

• Reindex
  • Reindex regularly.
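
The two built-in options look like this in SQL. The table public.sales and the index name are hypothetical. Repacking is done with an external tool and is not shown here.

```sql
-- Rewrites the table and its indexes into new files. Holds an
-- ACCESS EXCLUSIVE lock, so the table is unavailable meanwhile.
VACUUM FULL VERBOSE public.sales;

-- Rebuild all indexes of one table; writes to the table are blocked
-- while this runs.
REINDEX TABLE public.sales;

-- Or rebuild a single index.
REINDEX INDEX public.sales_sale_date_idx;
```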

Slide 31

Partial indexes?

• Write a script
• Use a cronjob
• Recreate your time-aware indexes every day. Will be fast.
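
The statements such a script would send each day might look like the sketch below. The table sales, the index names, and the dates are hypothetical; the cron-driven script is assumed to fill in the literal dates, since the index predicate cannot use current_date.

```sql
-- Build today's index without blocking writes. CREATE INDEX
-- CONCURRENTLY cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY sales_recent_20130615_idx
    ON sales (sale_date)
    WHERE sale_date >= DATE '2013-06-15';

-- Drop yesterday's index once the new one exists.
DROP INDEX IF EXISTS sales_recent_20130614_idx;
```

Creating the new index before dropping the old one means queries always have one of the two available.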

Slide 32

Slide 33

Questions?

• Postgres has an awesome community
• IRC: #postgresql @ freenode
• Check the mailing list

Slide 34

