The document provides an overview of five steps to optimize PostgreSQL performance: 1) application design, 2) query tuning, 3) hardware/OS configuration, 4) PostgreSQL configuration, and 5) caching. It discusses best practices for schema design, indexing, queries, transactions, and connection management to improve performance. Key recommendations include normalizing schemas, indexing commonly used columns, batching queries and transactions, using prepared statements, and implementing caching at multiple levels.
Josh Berkus
You've heard that PostgreSQL is the highest-performance transactional open source database, but you're not seeing it on YOUR server. In fact, your PostgreSQL application is kind of poky. What should you do? While advanced performance engineering for really high-end systems takes years to learn, you can learn the basics to solve performance issues for 80% of PostgreSQL installations in less than an hour. In this session, you will learn:
-- The parts of database application performance
-- The performance setup procedure
-- Basic troubleshooting tools
-- The 13 postgresql.conf settings you need to know
-- Where to look for more information
7. What Flavor is Your DB?
►W: Web Application (Web)
●DB smaller than RAM
●90% or more “one-liner” queries
8. What Flavor is Your DB?
►O: Online Transaction Processing (OLTP)
●DB slightly larger than RAM, up to 1TB
●20-70% small data write queries, some large transactions
9. What Flavor is Your DB?
►D: Data Warehousing (DW)
●Large to huge databases (100GB to 100TB)
●Large complex reporting queries
●Large bulk loads of data
●Also called "Decision Support" or "Business Intelligence"
(The W / O / D letters on the slides that follow mark which of these flavors a tip applies to.)
10. Tips for Good Form
►Engineer for the problems you have
●not for the ones you don't
11. Tips for Good Form
►A little overallocation is cheaper than downtime
●unless you're an OEM, don't stint a few GB
●resource use will grow over time
12. Tips for Good Form
►Test, Tune, and Test Again
●you can't measure performance by “it seems fast”
13. Tips for Good Form
►Most server performance is thresholded
●“slow” usually means “25x slower”
●it's not how fast it is, it's how close you are to capacity
15. Schema Design
►Table design
●do not optimize prematurely
▬normalize your tables and wait for a proven issue to denormalize
▬Postgres is designed to perform well with normalized tables
●Entity-Attribute-Value tables and other "innovative" designs tend to perform poorly
16. Schema Design
►Table design
●consider using natural keys
▬can cut down on the number of joins you need
●BLOBs can be slow
▬have to be completely rewritten, compressed
▬can also be fast, thanks to compression
17. Schema Design
►Table design
●think of when data needs to be updated, as well as read
▬sometimes you need to split tables which will be updated at different times
▬don't trap yourself into updating the same rows multiple times
18. Schema Design
►Indexing
●index most foreign keys
●index common WHERE criteria
●index common aggregated columns
●learn to use special index types: expression, full-text, partial (see the sketch below)
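A minimal sketch of those special index types; the tables and columns here are hypothetical, not from the talk:

    -- plain index on a common WHERE / join column
    CREATE INDEX idx_orders_customer ON orders (customer_id);
    -- expression index: matches queries that filter on lower(email)
    CREATE INDEX idx_users_lower_email ON users (lower(email));
    -- partial index: covers only the rows you actually query
    CREATE INDEX idx_orders_open ON orders (created_at) WHERE status = 'open';
    -- full-text index
    CREATE INDEX idx_docs_fts ON docs USING gin (to_tsvector('english', body));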
19. Schema Design
►Not Indexing
●indexes cost you on updates, deletes
▬especially with HOT
●too many indexes can confuse the planner
●don't index: tiny tables, low-cardinality columns
20. Right indexes?
►pg_stat_user_indexes
●shows indexes not being used
●note that it doesn't record unique index usage
►pg_stat_user_tables
●shows seq scans: index candidates?
●shows heavy update/delete tables: index less (example queries below)
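Hedged examples of those checks, using the standard statistics views (the ordering and LIMIT are just illustrative):

    -- indexes that have never been scanned: drop candidates
    SELECT schemaname, relname, indexrelname, idx_scan
    FROM pg_stat_user_indexes
    WHERE idx_scan = 0;

    -- tables with many sequential scans: index candidates?
    SELECT relname, seq_scan, seq_tup_read, idx_scan
    FROM pg_stat_user_tables
    ORDER BY seq_scan DESC
    LIMIT 20;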
21. Partitioning
►Partition large or growing tables
●historical data
▬data will be purged
▬massive deletes are server-killers
●very large tables
▬anything over 10GB / 10m rows
▬partition by active/passive
22. Partitioning
►Application must be partition-compliant
●every query should call the partition key
●pre-create your partitions (see the sketch below)
▬do not create them on demand … they will lock
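A sketch of pre-created, key-driven partitions. This talk predates declarative partitioning (PostgreSQL 10), so the syntax below is the modern form of what the era's inheritance-plus-CHECK-constraint setup accomplished; the table and ranges are hypothetical:

    CREATE TABLE events (
        event_time timestamptz NOT NULL,
        payload    text
    ) PARTITION BY RANGE (event_time);

    -- pre-create partitions ahead of need; creating them on demand takes locks
    CREATE TABLE events_2013_01 PARTITION OF events
        FOR VALUES FROM ('2013-01-01') TO ('2013-02-01');
    CREATE TABLE events_2013_02 PARTITION OF events
        FOR VALUES FROM ('2013-02-01') TO ('2013-03-01');

    -- partition-compliant query: always includes the partition key
    SELECT count(*) FROM events
    WHERE event_time >= '2013-01-15' AND event_time < '2013-01-16';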
23. Query design
►Do more with each query
●PostgreSQL does well with fewer, larger queries
●not as well with many small queries
●avoid doing joins, tree-walking in middleware (example below)
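For instance, one join instead of a per-row loop in the application; the schema is hypothetical:

    -- instead of fetching customers, then looping and running
    --   SELECT * FROM orders WHERE customer_id = $1
    -- once per customer, ask for everything at once:
    SELECT c.id, c.name, o.id AS order_id, o.total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE c.region = 'EU';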
24. Query design
►Do more with each transaction
●batch related writes into large transactions (sketch below)
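A minimal sketch: one explicit transaction and a multi-row INSERT instead of many autocommitted single-row writes (hypothetical tables):

    BEGIN;
    INSERT INTO measurements (sensor_id, reading, taken_at) VALUES
        (1, 20.1, now()),
        (2, 19.7, now()),
        (3, 21.4, now());
    UPDATE sensors SET last_seen = now() WHERE id IN (1, 2, 3);
    COMMIT;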
25. Query design
►Know the query gotchas (per version)
●always try rewriting subqueries as joins
●try swapping NOT IN and NOT EXISTS for bad queries
●try to make sure that index/key types match
●avoid unanchored text searches like ILIKE '%josh%'
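The NOT IN / NOT EXISTS swap looks like this (hypothetical tables; which form plans better depends on the version and the data, so test both):

    -- form 1
    SELECT u.id FROM users u
    WHERE u.id NOT IN (SELECT user_id FROM banned);

    -- form 2: often plans differently, and behaves sanely even if
    -- banned.user_id contains NULLs
    SELECT u.id FROM users u
    WHERE NOT EXISTS (SELECT 1 FROM banned b WHERE b.user_id = u.id);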
26. But I use ORM!
►ORM != high performance
●ORM is for fast development, not fast databases
●make sure your ORM allows "tweaking" queries
●applications which are pushing the limits of performance probably can't use ORM
▬but most don't have a problem
27. It's All About Caching (W, O)
►Use prepared queries
●whenever you have repetitive loops (sketch below)
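At the SQL level a prepared query looks like this; most drivers expose the same mechanism through their own prepared-statement API (names hypothetical):

    PREPARE get_user (int) AS
        SELECT id, name FROM users WHERE id = $1;

    -- inside the loop, only EXECUTE is sent; parse/plan work is reused
    EXECUTE get_user(42);
    EXECUTE get_user(43);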
28. It's All About Caching (W, O)
►Cache, cache everywhere
●plan caching: on the PostgreSQL server
●parse caching: in some drivers
●data caching:
▬in the appserver
▬in memcached/varnish/nginx
▬in the client (javascript, etc.)
●use as many kinds of caching as you can
29. It's All About Caching
But …
►think carefully about cache invalidation
●and avoid “cache storms”
30. Connection Management (W, O)
►Connections take resources
●RAM, CPU
●transaction checking
31. Connection Management (W, O)
►Make sure you're only using connections you need
●look for “<IDLE>” and “<IDLE> in Transaction” (query below)
●log and check for a pattern of connection growth
●make sure that database and appserver timeouts are synchronized
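A hedged way to do that check. The “<IDLE>” strings appear in pg_stat_activity.current_query on the versions this talk targets (pre-9.2); from 9.2 on, use the state column instead:

    -- 9.2+: how many connections are in each state?
    SELECT state, count(*)
    FROM pg_stat_activity
    GROUP BY state;

    -- 9.2+: long-lived "idle in transaction" sessions
    SELECT pid, now() - state_change AS idle_for
    FROM pg_stat_activity
    WHERE state = 'idle in transaction'
    ORDER BY idle_for DESC;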
32. Pooling
►Over 100 connections? You need pooling!
(diagram: several webservers → one connection pool → PostgreSQL)
33. Pooling
►New connections are expensive
●use persistent connections or connection pooling software
▬appservers
▬pgBouncer
▬pgPool (sort of)
●set pool size to maximum connections needed (sketch below)
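A minimal pgBouncer sketch, only to show the shape of the configuration; the host, names, and numbers are hypothetical, and the auth setup is omitted:

    ; pgbouncer.ini
    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_port = 6432
    pool_mode = transaction
    ; pool size = the server connections the database actually needs
    default_pool_size = 50
    ; many more clients than server connections
    max_client_conn = 1000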
36. Optimize Your Queries in Test
►Before you go production
●simulate user load on the application
●monitor and fix slow queries
●look for worst procedures
37. Optimize Your Queries in Test
►Look for “bad queries”
●queries which take too long
●data updates which never complete
●long-running stored procedures
●interfaces issuing too many queries
●queries which block
38. Finding bad queries
►Log Analysis
●dozens of logging options
●log_min_duration_statement (example below)
●pgfouine
●pgBadger
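A hedged postgresql.conf sketch for feeding a log analyzer; the threshold is illustrative, and pgBadger's documentation describes the log_line_prefix it expects:

    # log every statement slower than 250ms
    log_min_duration_statement = 250ms
    # matching lines then look like:
    #   LOG:  duration: 1270.458 ms  statement: SELECT ...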
40. Fixing bad queries
►reading EXPLAIN ANALYZE is an art
●it's an inverted tree
●look for the deepest level at which the problem occurs
►try re-writing complex queries several ways
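For example (hypothetical query; plan output omitted):

    EXPLAIN ANALYZE
    SELECT c.name, sum(o.total)
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name;
    -- read the plan from the innermost (deepest-indented) node outward, and
    -- compare each node's estimated row count against its actual rows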
46. max_connections
►As many as you need to use
●web apps: 100 to 300 (W, O)
●analytics: 20 to 40 (D)
►If you need more than 100 regularly, use a connection pooler
●like pgBouncer
47. shared_buffers
►1/4 of RAM on a dedicated server (W, O)
●not more than 8GB (test)
●cache_miss statistics can tell you if you need more
►fewer buffers to preserve cache space (D)
48. Other memory parameters
►work_mem
●non-shared
▬lower it for many connections (W, O)
▬raise it for large queries (D)
●watch for signs of misallocation
▬swapping RAM: too much work_mem
▬log temp files: not enough work_mem
●probably better to allocate by task/ROLE (sketch below)
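Per-ROLE allocation looks like this; the role names are hypothetical:

    -- modest default for the many web connections
    ALTER ROLE webapp SET work_mem = '8MB';
    -- much more for the few big reporting queries
    ALTER ROLE reporting SET work_mem = '512MB';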
49. Other memory parameters
►maintenance_work_mem
●the faster vacuum completes, the better
▬but watch out for multiple autovacuum workers!
●raise to 256MB to 1GB for large databases
●also used for index creation
▬raise it for bulk loads
50. Other memory parameters
►temp_buffers
●max size of temp tables before swapping to disk
●raise if you use lots of temp tables (D)
►wal_buffers
●raise it to 32MB
51. Commits
►checkpoint_segments
●more if you have the disk: 16, 64, 128
►synchronous_commit
●response time more important than data integrity?
●turn synchronous_commit = off (W)
●lose a finite amount of data in an unclean shutdown (crash)
52. Query tuning
►effective_cache_size
●RAM available for queries
●set it to 3/4 of your available RAM
►default_statistics_target (D)
●raise to 200 to 1000 for large databases
●now defaults to 100
●setting statistics per column is better
53. Query tuning
►effective_io_concurrency
●set to number of disks or channels
●advisory only
●Linux only
54. A word about random_page_cost
►Abused as a “force index use” parameter
►Lower it if the seek/scan ratio of your storage is actually different
●SSD/NAND: 1.0 to 2.0
●EC2: 1.1 to 2.0
●High-end SAN: 2.5 to 3.5
►Never below 1.0
(a consolidated settings sketch follows)
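Pulling this section together, a hedged postgresql.conf sketch for a hypothetical dedicated 32GB OLTP server; every number is a starting point to test, not a recommendation:

    max_connections = 200            # pool above ~100
    shared_buffers = 8GB             # ~1/4 of RAM, capped
    work_mem = 16MB                  # lower for many connections
    maintenance_work_mem = 512MB
    wal_buffers = 32MB
    checkpoint_segments = 64         # the pre-9.5 setting named in this talk
    effective_cache_size = 24GB      # ~3/4 of RAM
    default_statistics_target = 100
    random_page_cost = 3.0           # lower only if your storage seeks faster
    # synchronous_commit = off       # (W) only if losing a few recent
    #                                # transactions in a crash is acceptable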
55. Maintenance
►Autovacuum
●leave it on for any application which gets constant writes (W, O)
●not so good for batch writes; do manual vacuum for bulk loads (D)
56. Maintenance
►Autovacuum
●have 100's or 1000's of tables? use multiple autovacuum workers (autovacuum_max_workers)
▬but not more than ½ of your cores
●large tables? raise autovacuum_vacuum_cost_limit
●you can change settings per table (sketch below)
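Per-table autovacuum settings are storage parameters; the table name and numbers here are hypothetical:

    ALTER TABLE big_events SET (
        autovacuum_vacuum_cost_limit = 2000,
        autovacuum_vacuum_scale_factor = 0.02
    );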
58. Spread Your Files Around
►Separate the transaction log if possible (O, D)
●pg_xlog directory
●on a dedicated disk/array, performs 10-50% faster
●many WAL options only work if you have a separate drive
59. Spread Your Files Around
Which drive/array each file set goes on, by number of drives/arrays available:

    which partition     1 drive   2 drives   3 drives
    OS/applications     1         1          1
    transaction log     1         1          2
    database            1         2          3
60. Spread Your Files Around
►Tablespaces for temp files (D)
●more frequently useful if you do a lot of disk sorts
●Postgres can round-robin multiple temp tablespaces (sketch below)
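A sketch; the paths are hypothetical, and the directories must already exist and be owned by the postgres user:

    CREATE TABLESPACE temp1 LOCATION '/disk1/pgtemp';
    CREATE TABLESPACE temp2 LOCATION '/disk2/pgtemp';
    -- then, in postgresql.conf: temp_tablespaces = 'temp1, temp2'
    -- Postgres round-robins temp files across the list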
61. Linux Tuning
►Filesystems
●use XFS or Ext4
▬Btrfs is not ready yet, and may never work well for databases
▬Ext3 has horrible flushing behavior
●reduce logging
▬data=ordered, noatime, nodiratime
62. Linux Tuning
►OS tuning
●must increase shmmax, shmall in the kernel
●use the deadline or noop scheduler to speed writes
●disable NUMA memory localization (recent kernels)
●check your kernel version carefully for performance issues!
63. Linux Tuning
►Turn off the OOM Killer!
●vm.oom-kill = 0
●vm.overcommit_memory = 2
●vm.overcommit_ratio = 80
64. OpenSolaris/Illumos
►Filesystems
●use ZFS
▬reduce block size to 8K (W, O)
●turn off full_page_writes
►OS configuration
●no need to configure shared memory
●use packages compiled with the Sun compiler
66. What about The Cloud?
►Configuring for cloud servers is different
●shared resources
●unreliable I/O
●small resource limits
►Also depends on which cloud
●AWS, Rackspace, Joyent, GoGrid …
so I can't address it all here.
67. What about The Cloud?
►Some general advice:
●make sure your database fits in RAM
▬except on Joyent
●don't bother with most OS/FS tuning
▬just some basic FS configuration options
●use synchronous_commit = off if possible
68. Set up Monitoring!
►Get warning ahead of time
●know about performance problems before they go critical
●set up alerts
▬80% of capacity is an emergency!
●set up trending reports
▬is there a pattern of steady growth?
70. Hardware Basics
►Four basic components:
●CPU
●RAM
●I/O: disks and disk bandwidth
●Network
71. Hardware Basics
►Different priorities for different applications
●Web: CPU, Network, RAM, … I/O (W)
●OLTP: balance all (O)
●DW: I/O, CPU, RAM (D)
72. Getting Enough CPU
►One Core, One Query
●how many concurrent queries do you need?
●best performance at one core per no more than two concurrent queries
►So if you can up your core count, do
►Also: L1/L2 cache size matters
73. Getting Enough RAM
►RAM use is "thresholded"
●as long as you are above the amount of RAM you need, even by 5%, the server will be fast
●go even 1% over capacity and things slow down a lot
74. Getting Enough RAM
►Critical RAM thresholds (W)
●Do you have enough RAM to keep the database in shared_buffers?
▬RAM 3x to 6x the size of the DB
75. Getting Enough RAM
►Critical RAM thresholds (O)
●Do you have enough RAM to cache the whole database?
▬RAM 2x to 3x the on-disk size of the database
●Do you have enough RAM to cache the “working set”?
▬the data which is needed 95% of the time
76. Getting Enough RAM
►Critical RAM thresholds (D)
●Do you have enough RAM for sorts & aggregates?
▬What's the largest data set you'll need to work with?
▬For how many users?
77. Other RAM Issues
►Get ECC RAM
●Better to know about bad RAM before it corrupts your data.
►What else will you want RAM for?
●RAMdisk?
●SW RAID?
●Applications?
78. Getting Enough I/O
►Will your database be I/O bound?
●many writes: bound by the transaction log (O)
●database much larger than RAM: bound by I/O for many/most queries (D)
79. Getting Enough I/O
►Optimize for the I/O you'll need
●if your DB is terabytes, spend most of your money on disks
●calculate how long it will take to read your entire database from disk
▬backups
▬snapshots
●don't forget the transaction log!
80. I/O Decision Tree
(flowchart: pick storage by answering, in order: lots of writes? does the DB fit in RAM? can you afford good HW RAID? terabytes of data? mostly read? The outcomes range from simple mirroring through SW RAID and HW RAID to a dedicated storage device, with RAID 5 for mostly-read databases and RAID 1+0 otherwise.)
81. I/O Tips
►RAID
●get battery backup and turn your write cache on
●SAS has 2x the real throughput of SATA
●more spindles = faster database
▬big disks are generally slow
82. I/O Tips
►DAS/SAN/NAS
●measure lag time: it can kill response time
●how many channels?
▬“gigabit” is only 100MB/s
▬make sure multipath works
●use fiber if you can afford it
84. SSD
►Very fast seeks (D)
●great for index access on large tables
●up to 20X faster
►Not very fast random writes
●low-end models can be slower than HDD
●most are about 2X speed
►And use server models, not desktop!
85. NAND (FusionIO)
All the advantages of SSD, plus:
►Very fast writes (5X to 20X) (W, O)
●more concurrency on writes
●MUCH lower latency
►But … very expensive (50X)
86. Tablespaces for NVRAM
►Have a "hot" and a "cold" tablespace (O, D)
●current data on "hot"
●older/less important data on "cold"
●combine with partitioning
►compromise between speed and size (sketch below)
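A sketch of the hot/cold split, continuing the hypothetical partitioned table from the partitioning section; the paths are hypothetical too:

    CREATE TABLESPACE hot  LOCATION '/nvram/pgdata';
    CREATE TABLESPACE cold LOCATION '/bigslow/pgdata';
    -- current partition on fast storage, old partitions on cheap storage
    ALTER TABLE events_2013_02 SET TABLESPACE hot;
    ALTER TABLE events_2013_01 SET TABLESPACE cold;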
87. Network
►Network can be your bottleneck
●lag time
●bandwidth
●oversubscribed switches
●NAS
88. Network
►Have dedicated connections
●between appserver and database server
●between database server and failover server
●between database and storage
89. Network
►Data Transfers
●Gigabit is only 100MB/s
●calculate capacity for data copies, standby, dumps
90. The Most Important Hardware Advice:
►Quality matters
●not all CPUs are the same
●not all RAID cards are the same
●not all server systems are the same
●one bad piece of hardware, or bad driver, can destroy your application performance
91. The Most Important Hardware Advice:
►High-performance databases require hardware expertise
●the statistics don't tell you everything
●vendors lie
●you will need to research different models and combinations
●read the pgsql-performance mailing list
92. The Most Important Hardware Advice:
►Make sure you test your hardware before you put your database on it
●“Try before you buy”
●Never trust the vendor or your sysadmins
93. The Most Important Hardware Advice:
►So Test, Test, Test!
●CPU: PassMark, sysbench, Spec CPU
●RAM: memtest, cachebench, Stream
●I/O: bonnie++, dd, iozone
●Network: bwping, netperf
●DB: pgBench, sysbench
94. Questions?
►Josh Berkus
● josh@pgexperts.com
● www.pgexperts.com
▬ /presentations.html
● www.databasesoup.com
● irc.freenode.net
▬ #postgresql
►More Advice
● www.postgresql.org/docs
● pgsql-performance mailing list
● planet.postgresql.org
This talk is copyright 2013 Josh Berkus, and is licensed under the Creative Commons Attribution license.