What Assumptions Make:
Performance Testing with P4
     Portland PostgreSQL Performance Pad




            Selena Deckelmann
           selena@endpoint.com
           End Point Corporation
           twitter: @selenamarie
www.endpoint.com
SCALE 7x, Feb 21-22, 2009
Do filesystems do
                          what we expect?
We are volunteers.
We think you should run
                           these tests.
We are:
                                  DBAs
                               Sysadmins
                          Performance tuners
How will this hardware
                                perform?
How will this filesystem
                           perform?
Why should you care about
                        filesystem-specific
                           performance?
Expectations
PERSONAL CONFESSION
Where to start?
The Defaults.
Not addressing reliability
Very Narrow Use Case:
                          A Relational Database
Need for periodic testing.
                        (And we've got some
                             hardware!)
★Kernel differences
                          ★FS patch-level differences
                          ★Mount options
                          ★mkfs options
Focused on
                                THROUGHPUT
                          (Because that’s what people who
                             buy large systems look for)
Later:
                             Response Time
                          Operations per second
No, we will not
     be testing ZFS.
FS




                BtrFS
                (nope, not yet)
What do we expect?
Some conventional
                              wisdom:
“RAID5 is the
                            worst choice
                          for a database.”
“LVM incurs
                          too much overhead
                                to use.”
“Striping doubles
                           performance.”
“Turning off 'atime'
                                is a big
                          performance gain.”
“Getting rid of atime
                          updates would give us
                          more everyday Linux
                          performance than all
                          the pagecache speedups
                          of the last 10 years,
                          _combined_.”
“Journaling filesystems (ext3)
will have worse performance than
non-journaling filesystems (ext2).”
“Your read-ahead
                                buffer
                           is big enough.”
Now... on to the good stuff.
PostgreSQL’s Portland Performance Pad
Hosted by CommandPrompt, Inc.
Our machine:

                HP ProLiant DL380 G5
                Smart Array P800

                72GB 15,000 RPM SAS (up to 25 disks)
                32GB RAM

                Linux:
                2.6.25-gentoo-r6
                *New tests being run with 2.6.28
Our machine:
                             Chosen because
                          of its low, low price.

                          Thank you, HP.
Our tests:
                          fio
                          64 GB working set
                          8 threads
                          no fadvise
                          no direct i/o
                          8KB blocksize
                          I/O elevator: deadline
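
The parameters on this slide map fairly directly onto a fio job file. The sketch below is one possible rendering, not the job files actually used for these tests. The job name, the `/mnt/test` directory, the `rw` pattern, and the split of the 64 GB working set into 8 GB per thread are all assumptions made for illustration. The I/O elevator is not a fio setting; it is selected per device through sysfs and is left out here.

```python
import configparser
import io


def build_fio_job(directory="/mnt/test", rw="read"):
    """Render a fio job file as text.

    directory and rw are hypothetical defaults, not values from the slides.
    """
    job = configparser.ConfigParser(allow_no_value=True)
    job["global"] = {
        "directory": directory,   # assumed mount point of the filesystem under test
        "bs": "8k",               # 8KB blocksize
        "direct": "0",            # no direct I/O: go through the page cache
        "fadvise_hint": "0",      # no fadvise
        "thread": None,           # use threads rather than forked processes
        "numjobs": "8",           # 8 threads
        "size": "8g",             # assumption: 8 threads x 8 GB = 64 GB working set
        "group_reporting": None,  # report the 8 threads as one aggregate
    }
    job[f"pg-{rw}"] = {"rw": rw}  # hypothetical job name
    buf = io.StringIO()
    job.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(build_fio_job(rw="randwrite"))
```

Writing the output to a file and passing that file to `fio` would run one pattern. Repeating with `read`, `write`, `randread`, and `randwrite` covers the basic sequential and random cases.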
Our stats:
                          sar
                          mpstat
                          iostat
                          vmstat
                          readprofile
Our tests:
                     Chosen because of their
                     relevance to PostgreSQL
Filesystems Tested:

                 ext2
                 ext3
                 jfs
                 xfs
                 reiserfs
                 ext4 (but having trouble)
Disk configs tested:

                     Single disk
                     RAID-0
                     RAID-1
                     RAID-5
                     RAID-10
                     RAID-6
The Data:
                     http://moourl.com/fsperf
Confessions:
                     • May be high standard deviation with
                      results (don’t know yet!)

                     •No filesystem tuning, all default create
                      and mount options

                     •No software raid comparison or lvm
                      (volume management test) for 2.6.28
                      tests
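
One way to find out whether the standard deviation is high is to repeat each test several times and look at the spread relative to the mean. The sketch below assumes per-run throughput figures are already available as a list. The numbers in it are made up for illustration and are not results from these tests.

```python
import statistics


def summarize_runs(throughputs_mb_s):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(throughputs_mb_s)
    stdev = statistics.stdev(throughputs_mb_s)  # needs at least two runs
    return mean, stdev, stdev / mean


if __name__ == "__main__":
    # Hypothetical throughput values in MB/s, not measured data.
    runs = [100.0, 110.0, 90.0]
    mean, stdev, cv = summarize_runs(runs)
    print(f"mean={mean:.1f} MB/s  stdev={stdev:.1f}  cv={cv:.1%}")
```

A coefficient of variation that is large compared with the difference between two configurations suggests that more runs are needed before the configurations can be ranked.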
Confessions:
                     • Some xfs runs had to be repeated and
                      some ext4 runs did not complete
                      successfully

                     • Only presenting throughput

                     • Interested in system performance for a
                      specific application, not code
                      performance
Confessions:
                     •I/O profiles don’t exhibit atime or
                      partition alignment issues

                     •Disk controller firmware not at the
                      latest version in 2.6.25 tests

                     •Software RAID is on top of 1 disk RAID 0
                      devices (HP SmartArray doesn’t have
                      JBOD option)
AUDIENCE PARTICIPATION
                            Higher throughput:
                               ext2 or ext3?
Seek bundling/batching
                         in ext3 is better?
What if we add a disk?
AUDIENCE PARTICIPATION
                           RAID 0 (stripe) versus
                            RAID 1 (mirroring)
                               performance?
What happens when we:
                              add disks to a
                           RAID 0 (stripe) LUN?
Adding disks to a
                            RAID 5 LUN
Only have 4 disks?
                          What should you do?
In most cases, RAID 5 outperforms
on sequential writes (xlog).

Random writes improve only
on xfs and reiserfs.
Are software RAID
                          and LVM slow?
The Read-ahead buffer
AUDIENCE PARTICIPATION
                                Readahead buffer:
                                 Default is 128 K
                          What do you think it should be?
And is there a cost to
                          increasing the buffer
                               that much?
http://moourl.com/readaheadconfirm
Future Work
                          •OLTP system characterization,
                           sizing
                          •Daily OLTP regression testing
                          •More presentations
                          •P5 - PostgreSQL Portland
                           Performance Pad PRACTICE
MOAR Hardware?
                     Thanks again, HP!
                     MSA70, DL380 in 2009 ??
Let’s recap...
“RAID5 is the worst choice for a
          database.” Fast for sequential writes in
          our tests.

          “LVM incurs too much overhead to use.
          Software RAID is slower.” For reads –
          throughput is about the same, but saw
          higher CPU.

          “Turning off 'atime' is a big performance
          gain.” Not in our tests. But, 2-3% for
          “free”.
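
To collect the "free" 2-3% from atime, the first step is knowing whether a filesystem is already mounted with `noatime`. The sketch below parses text in the format of `/proc/mounts`. The sample line, device name, and mount point are hypothetical.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of mount options for mount_point, or None if not mounted.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match: later mounts shadow earlier ones.
            options = set(fields[3].split(","))
    return options


if __name__ == "__main__":
    # Hypothetical entry for illustration only.
    sample = "/dev/sda1 /var/lib/postgresql ext3 rw,noatime,data=ordered 0 0\n"
    opts = mount_options(sample, "/var/lib/postgresql")
    print("noatime set:", opts is not None and "noatime" in opts)
```

On a live system the text would come from reading `/proc/mounts`, and the mount point would be wherever the database files live.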
“Journaling filesystems will have worse
          performance than non-journaling
          filesystems.” Turn the data journaling
          off on ext3, and you do see better
          performance, but there are edge cases
          and performance differences we could
          not explain.

          “Striping doubles performance.”
          Performance is better, but nowhere
          near double. Why?
“Your read-ahead buffer is big enough.”
          Your read-ahead buffer IS NOT big
          enough. Make it 8MB. And can we make
          that the default?
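
Read-ahead is commonly set with `blockdev --setra`, which counts in 512-byte sectors rather than kilobytes, so the 128 K default and the suggested 8MB need converting first. The sketch below does that arithmetic, assuming "K" and "MB" on the slides mean KiB and MiB.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_sectors(kib):
    """Convert a read-ahead size in KiB to 512-byte sectors."""
    return kib * 1024 // SECTOR_BYTES


if __name__ == "__main__":
    print(readahead_sectors(128))       # default: 256 sectors
    print(readahead_sectors(8 * 1024))  # 8 MiB: 16384 sectors
```

The same value can also be set in kilobytes through the device's `queue/read_ahead_kb` file under `/sys/block`. Either way the setting applies per block device and does not survive a reboot unless it is reapplied at startup.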
Thank you!
                                         Results:
                             http://wiki.postgresql.org/wiki/
                            HP_ProLiant_DL380_G5_Tuning_Guide



                          http://moourl.com/fsperf

                                    Selena Deckelmann
                                  selena@endpoint.com
                                   twitter: @selenamarie

