
AWS Webcast - Achieving consistent high performance with Postgres on Amazon Web Services using EBS Provisioned IOPS


Postgres is a popular relational database and is the backend of a number of high-traffic applications. Join AWS and PalominoDB, the company that helped the Obama for America campaign optimize its database infrastructure on AWS, to learn how you can run high-throughput, I/O-intensive Postgres clusters on the Amazon EBS storage platform. We will go over best practices, including performance, durability and optimization, related to deploying Postgres on AWS.

You will also hear about the best practices learned and applied for the Obama for America campaign.

In this webinar, you will learn about:
- Amazon Elastic Block Store (EBS)
- Why Provisioned IOPS volumes fit the needs of I/O-intensive applications
- Best practices for deploying Postgres on AWS
- How to leverage Provisioned IOPS volumes for Postgres


  1. Mastering PostgreSQL with AWS
     Jafar Shameem, Business Development Manager, Amazon Web Services
     Miles Ward, Senior Manager, Solutions Architecture, Amazon Web Services
     Jay Edwards, CTO, PalominoDB
  2. Agenda
     • AWS Storage Options and EBS
     • EBS Provisioned IOPS
     • About Postgres
     • Postgres on AWS best practices
     • Lessons learned from the OFA campaign
  3. Storage Options on AWS
     Block Storage (Elastic Block Store). Use for:
     • Access to raw, unformatted block-level storage
     • Persistent storage
     Object Storage (S3, Glacier). Use for:
     • Pictures, videos, highly durable media storage
     • Cold storage for long-term archive
  4. Amazon Elastic Block Store (EBS)
     Elastic Block Storage: Persistent Storage for EC2
     • High performance block storage device
     • Mount as drives to instances
     • Persistent and independent of instance lifecycle
     Feature: Details
     • High performance file system: Mount EBS as drives and format as required
     • Flexible size: Volumes from 1 GB to 1 TB in size
     • Secure: Private to your instances
     • Available: Replicated within an Availability Zone
     • Backups: Volumes can be snapshotted for point-in-time restore
     • Monitoring: Detailed metrics captured via CloudWatch
  5. Standard and Provisioned IOPS Volume Types
     Standard Volumes
     • Optimized for: Workloads with low or moderate IOPS needs and occasional bursts.
     • Volume attributes: Up to 1 TB, average 100 IOPS per volume. Best effort performance. Can be striped together for larger size and higher IOPS.
     • Workloads: File server, log processing, websites, analytics, boot, etc.
     Provisioned IOPS Volumes
     • Optimized for: Transactional workloads requiring consistent IOPS.
     • Volume attributes: Up to 1 TB, 4,000 IOPS per volume. Consistent IOPS. Can be striped together for larger size and higher IOPS.
     • Workloads: Business applications, MongoDB, SQL Server, MySQL, Postgres, Oracle, etc.
  6. Introducing Provisioned IOPS Volumes
  7. Introducing Provisioned IOPS Volumes
     ❶ Select the new Provisioned IOPS volume type
     ❷ Specify the volume capacity
     ❸ Specify the number of IOs per second your application needs, up to 4000 PIOPS per volume. The volume will deliver the specified IOs per second.
     $ ec2-create-volume --size 500 --availability-zone us-east-1b --type io1 --iops 2000
  8. What are customers running on EBS?
     • Enterprises: Enterprise workloads are built on block storage
       o Oracle, SAP, Microsoft applications
       o Convenient, cost-effective, reliable file server
     • Gaming/Social/Mobile/Education: Very high performance and consistent IO for NoSQL and relational DBs
     • Marketing/Analytics: Fast sequential IO access
  9. PostgreSQL
     • Open-source RDBMS
     • Rich features
     • Extraordinary stability
     • Focus on performance
     • Full ACID compliance for applications requiring durability and availability
     • Robust GIS functionality
  10. PostgreSQL on AWS: Best Practices
  11. Concepts
     • Master PostgreSQL host
       o Accepts both reads and writes
       o May have many replicas
       o Records transferred to replicas using Write-Ahead Logging (WAL)
     • Secondary PostgreSQL host
       o Receives WAL records from the master
       o Replication can be real-time or delayed
     • Hot standby
       o A secondary host that can receive read queries
  12. Installation
     • Start with an Amazon Machine Image (AMI) of your choice
     • Launch an EC2 instance and attach an EBS volume to it
     • Install software from ftp.postgresql.org
     • Edit the EC2 security group to allow ingress for port 5432
     • Edit postgresql.conf for:
       o listen_addresses = '*'
     • For master-slave configurations:
       o Set max_wal_senders > 0
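
The two settings named on the Installation slide can be written out as a postgresql.conf fragment. The sketch below is only an illustration, not the presenters' tooling: the helper `render_conf` is hypothetical, and the value 3 for `max_wal_senders` is an assumed example (the slide only asks for a value greater than 0).

```python
def render_conf(settings):
    """Render a dict of settings as postgresql.conf lines.

    Strings are single-quoted (embedded quotes doubled) and numbers are
    written bare. Hypothetical helper, for illustration only.
    """
    lines = []
    for name, value in settings.items():
        if isinstance(value, str):
            escaped = value.replace("'", "''")
            lines.append(f"{name} = '{escaped}'")
        else:
            lines.append(f"{name} = {value}")
    return "\n".join(lines)


# Values from the slide; 3 is an assumed example of "greater than 0".
master_settings = {
    "listen_addresses": "*",
    "max_wal_senders": 3,
}

print(render_conf(master_settings))
```

The printed lines can be pasted into postgresql.conf on the master; a restart is needed for either setting to take effect.
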
  13. Temporary data / SSD Storage
     • You can create a normal tablespace on instance storage with UNLOGGED tables in it to take advantage of the increased performance available with SSDs
     • When you create a new table, query the relfilenode of the new table and back up the file system identified by the query results into permanent storage. (Be sure to do this before you put any data in the table.)
  14. Architecture – Building Blocks
     • Master
     • Streaming Replication
  15. Replication Basics
     • Records are transferred to the replicas via Write-Ahead Logging (WAL)
     • Replication can be real-time through “streaming replication” or delayed via “WAL archiving”
     • Replication on PostgreSQL supports two levels of durability: asynchronous and synchronous. Only one replica can be in synchronous mode. You may, however, provide an ordered list of candidate synchronous replicas to fall back on if the primary replica is down.
     • Since version 9.2, PostgreSQL has supported Cascading Replication
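
The ordered list of candidate synchronous replicas can be pictured as a simple priority pick: the first standby in the list that is currently connected acts as the synchronous one. The function and the standby names below are hypothetical; this is a sketch of the selection rule as described on the slide, not PostgreSQL's own code.

```python
def pick_synchronous_standby(candidates, connected):
    """Return the first candidate that is currently connected, else None.

    candidates: standby names in priority order (highest priority first).
    connected:  set of standby names that are connected right now.
    """
    for name in candidates:
        if name in connected:
            return name
    return None


# Hypothetical standby names, in priority order.
candidates = ["replica_a", "replica_b", "replica_c"]

print(pick_synchronous_standby(candidates, {"replica_a", "replica_c"}))
print(pick_synchronous_standby(candidates, {"replica_c"}))
```

In the second call the two higher-priority standbys are unavailable, so the role falls through to the last candidate.
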
  16. Architecture – Production Designs
     • Functional Partitioning
     • Vertically scale to largest EC2 instance and storage
     • Tune for the available hardware
     • Use replication to create multiple replicas if bound by reads
     • Shard your data-sets if bound by writes
  17. Architecture – Anti-patterns
     • Vertical scaling does not offer all the benefits of horizontal scaling
     • Scaling step-by-step when you know you need a big system is not efficient
     • ACID compliance has a cost. Consider NoSQL data stores for logs or session data
     • You might not need to do everything in the DB
  18. Performance – Minimum production scale
     • Always use Elastic Block Store (EBS)
       o Significant write cache
       o Superior random IO performance
       o Enhanced durability compared to instance stores
  19. Performance – Larger production scale
     • Move up to higher bandwidth instance types (m1.xlarge, c1.xlarge, m2.4xlarge)
     • Increase EBS volume size to > 300 GB
     • Increase number of volumes in RAID set
  20. Performance – Extra-large scale
     • Leverage Cluster Compute instance types
       o More bandwidth to EBS
       o Ex. CC2 will make excellent primary nodes, particularly when paired with a large number of EBS volumes (= 8)
     • Improve RAID configuration with:
       o effective_io_concurrency = # of stripes in RAID set
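
As a back-of-the-envelope illustration of striping, the sketch below multiplies the per-volume IOPS by the number of volumes and emits the matching `effective_io_concurrency` line. It assumes a RAID 0 stripe whose IOPS scale linearly with the volume count, which is an idealisation: real throughput also depends on instance bandwidth to EBS. The function names are hypothetical; 8 volumes and 4,000 IOPS per volume are the figures quoted in the slides.

```python
def striped_iops_ceiling(volumes, iops_per_volume):
    """Upper bound on IOPS for a stripe set, assuming linear scaling."""
    return volumes * iops_per_volume


def io_concurrency_setting(volumes):
    """postgresql.conf line matching the number of stripes in the RAID set."""
    return f"effective_io_concurrency = {volumes}"


print(striped_iops_ceiling(8, 4000))  # 32000
print(io_concurrency_setting(8))      # effective_io_concurrency = 8
```
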
  21. Performance – Extra-large production scale
     • Can also leverage SSD instance type (hi1.4xlarge)
       o 2 x 1 TB SSD storage (ephemeral storage)
       o Perfect for replicas
     • If replicas are on SSD instance types, disable integrity features such as fsync and full_page_writes on those hosts to improve performance
  22. Benchmarking storage
     • Sequential test example:
       o dd if=/dev/zero of=<location in the disk> bs=8192 count=10000 oflag=direct
     • Seek test example:
       o sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=rndrw prepare
       o sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=rndrw --file-fsync-all run
       o sysbench --num-threads=16 --test=fileio --file-total-size=3G --file-test-mode=rndrw cleanup
       o For more aggressive tests, add the --file-fsync-all option, especially if comparing different filesystems (ex. ext4 vs XFS)
  23. Benchmarking storage through PostgreSQL
     • Use pgbench
     • Initialize the data set with the desired scale:
       o pgbench -i -s1000 -Upostgres database
     • Run a simple test with 20 clients with 100 transactions each against the master:
       o pgbench -c 20 -t 100 -Upostgres database
     • Run a “read-only/no vacuum” test against the slave:
       o pgbench -S -n -c 20 -t 1000 -h slave -Upostgres database
     • If planning on using pgpool, test against it instead of the DB
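
To get a feel for the size of these pgbench runs, the sketch below works out the row and transaction counts and assembles the read-only command line as an argument list. It relies on pgbench creating 100,000 `pgbench_accounts` rows per unit of scale factor; the helper names are hypothetical, and the user, database and host values are simply the placeholders used on the slide.

```python
ROWS_PER_SCALE_UNIT = 100_000  # pgbench_accounts rows created per unit of -s


def accounts_rows(scale):
    """Rows in pgbench_accounts after `pgbench -i -s <scale>`."""
    return scale * ROWS_PER_SCALE_UNIT


def total_transactions(clients, per_client):
    """Total transactions for a run with -c <clients> -t <per_client>."""
    return clients * per_client


def pgbench_run_args(clients, transactions, user, database,
                     host=None, select_only=False, no_vacuum=False):
    """Build a pgbench argument list (suitable for subprocess.run)."""
    args = ["pgbench"]
    if select_only:
        args.append("-S")   # built-in select-only script
    if no_vacuum:
        args.append("-n")   # skip vacuuming before the test
    args += ["-c", str(clients), "-t", str(transactions)]
    if host:
        args += ["-h", host]
    args += ["-U", user, database]
    return args


print(accounts_rows(1000))          # 100000000
print(total_transactions(20, 100))  # 2000
print(" ".join(pgbench_run_args(20, 1000, "postgres", "database",
                                host="slave", select_only=True,
                                no_vacuum=True)))
```

A scale factor of 1000 therefore loads 100 million account rows, which is worth knowing before starting the initialization on a small volume.
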
  24. Backups using EC2 snapshots
     • Snapshot mounted volume:
       o SELECT pg_start_backup('label', true);
       o ec2-create-snapshot -d "postgres clon" vol-24592c0e
       o SELECT pg_stop_backup();
     • If operating near maximum I/O capacity, it is recommended to use a replica for backups
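
The three backup steps are order-sensitive, and the database should not be left in backup mode if the snapshot call fails. A minimal sketch of that ordering follows. The `run_sql` and `create_snapshot` callables are hypothetical stand-ins for a database cursor and an EC2 client, and the `%s` placeholder style is an assumption borrowed from common Python database drivers; nothing here is the presenters' actual script.

```python
def snapshot_in_backup_mode(run_sql, create_snapshot, volume_id, label):
    """Snapshot a volume between pg_start_backup() and pg_stop_backup()."""
    run_sql("SELECT pg_start_backup(%s, true)", (label,))
    try:
        return create_snapshot(volume_id)
    finally:
        # Runs whether or not the snapshot call succeeded.
        run_sql("SELECT pg_stop_backup()", ())


# Stand-ins so the sketch runs without a database or AWS credentials.
calls = []


def fake_run_sql(statement, params):
    calls.append(statement)


def fake_create_snapshot(volume_id):
    calls.append(f"snapshot {volume_id}")
    return "snap-example"  # made-up snapshot id


snapshot_id = snapshot_in_backup_mode(
    fake_run_sql, fake_create_snapshot, "vol-24592c0e", "label")
print(snapshot_id)
print(calls)
```

Passing the two operations in as callables keeps the ordering logic testable on its own; in practice they would wrap a real database connection and a real snapshot API call.
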
  25. Restores using an EC2 snapshot
     • Check available snapshots
       o $ ec2-describe-snapshots
     • Create EBS volumes from each snapshot used to back up the DB
       o $ ec2-create-volume --snapshot snap-219c1308 --availability-zone eu-west-1c
     • Attach volumes to instances
       o $ ec2-attach-volume -i i-96ec5edd -d /dev/sdc vol-eb1561c1
     • If using a RAID set, replace volumes in the same order for easiest re-creation of the RAID volume in the OS
     • Mount the volumes on the instance and assign the corresponding permissions
  26. Tunables
     • Swappiness, vm, kernel tuning
       o By default shmmax and shmall have really small values. Those values are linked to shared_buffers in postgresql.conf: if shared_buffers is higher than the kernel parameters allow, PostgreSQL won’t start.
       o vm.swappiness is recommended to be set to a value under 5. This avoids using swap unless it is really necessary.
     • File System Tuning
       o XFS (nobarrier,noatime,noexec,nodiratime)
       o EXT3/4
     • You can use ext3 or non-journaled file systems for logs.
  27. Tunables
     • WAL
       o It is strongly recommended to separate the data from the pg_xlog (WAL) folder. For the WAL files we strongly recommend the XFS filesystem, due to the high number of fsync calls generated.
       o checkpoint_segments: the value of this variable depends strictly on the amount of data modified on the instance. At the beginning, you can start with a moderate value and monitor the logs, looking for HINT messages.
       o File segments are 16 MB each, so it is easy to fill them if you have batch processes adding or modifying data. You could easily need more than 30 on a busy server.
       o We recommend not using the ext3 file system if you plan to have the WALs in the same directory as the data. fsync calls are handled inefficiently by this file system.
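
When sizing a separate WAL volume, it helps to translate checkpoint_segments into disk space. The sketch below uses the rule of thumb from the PostgreSQL documentation for releases that still have checkpoint_segments: the number of segment files is normally no more than (2 + checkpoint_completion_target) * checkpoint_segments + 1. The function names are hypothetical, and 0.5 is assumed as the checkpoint_completion_target because it is the default in those releases.

```python
import math

SEGMENT_MB = 16  # size of one WAL segment file, as stated on the slide


def max_wal_files(checkpoint_segments, completion_target=0.5):
    """Normal upper bound on the number of WAL segment files."""
    return math.ceil((2 + completion_target) * checkpoint_segments) + 1


def max_wal_mb(checkpoint_segments, completion_target=0.5):
    """The same bound expressed as disk space in MB."""
    return max_wal_files(checkpoint_segments, completion_target) * SEGMENT_MB


print(max_wal_files(30))  # 76
print(max_wal_mb(30))     # 1216
```

With the slide's figure of 30 segments, the WAL directory normally stays under roughly 1.2 GB; WAL archiving backlogs or wal_keep_segments can push it higher.
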
  28. Tunables
     • Memory Tuning
       o shared_buffers is the most important and most difficult memory variable to tune. A quick recommendation is to start with ¼ of your RAM.
     • PGTune is a Python script that recommends a configuration according to the hardware on your server.
       o https://github.com/gregs1104/pgtune/archive/master.zip
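
The quarter-of-RAM starting point, together with the earlier warning that PostgreSQL will not start when shared_buffers exceeds the kernel's shared memory limit, can be sketched as below. The function names, the 16 GB of RAM and the 32 MB shmmax value are assumptions chosen for illustration; PostgreSQL also needs some shared memory beyond shared_buffers, so the check is a lower bound rather than an exact test.

```python
def shared_buffers_mb(ram_mb):
    """Starting point from the slide: one quarter of RAM, in MB."""
    return ram_mb // 4


def exceeds_shmmax(buffers_mb, shmmax_bytes):
    """True if shared_buffers alone is already larger than kernel.shmmax."""
    return buffers_mb * 1024 * 1024 > shmmax_bytes


ram_mb = 16 * 1024                 # assumed instance with 16 GB of RAM
assumed_shmmax = 32 * 1024 * 1024  # assumed small kernel.shmmax (32 MB)

buffers = shared_buffers_mb(ram_mb)
print(f"shared_buffers = {buffers}MB")           # shared_buffers = 4096MB
print(exceeds_shmmax(buffers, assumed_shmmax))   # True
```

A `True` result means kernel.shmmax has to be raised before the new shared_buffers value is applied.
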
  29. Monitoring
     • Use the CloudWatch service to monitor:
       o checkpoint_segments warnings
       o Number of connections
       o Memory usage and load average
       o Slow queries
       o Replication lag
     http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/mon-scripts-perl.html
  30. Security
     • Disk Encryption
       o Filesystem or OS tools
     • Row level Encryption
       o pgcrypto
     • SSL
     • Authentication and Network
     • And IAM!!
  31. Lessons learned from OFA campaign
  32. Lessons Learned
     • Use the best practices mentioned earlier
     • Use Provisioned IOPS
     • AWS Enterprise Support is definitely worth the cost
     • Inventory Management is underrated – it’s magic!
     • Trusted Advisor is much better than it used to be
     • AWS Product Lifecycle
       o Starts off not so good
       o Gets LOTS better
     • Hard to keep up to date with every feature of every product
  33. • Slides will be made available here:
       o http://aws.amazon.com/ebs/webinars/
     • Benchmarking Postgres with EBS 4000 IOPS/volume
       o http://palominodb.com/blog/2013/05/08/benchmarking-postgres-aws-4000-piops-ebs-instances
     • Creating consistent EBS snapshots with MySQL and XFS on EC2
       o http://alestic.com/2009/09/ec2-consistent-snapshot
     • Understanding Amazon EBS Availability and Performance
       o http://www.slideshare.net/AmazonWebServices/understanding-ebs-availabilityandperformance
     • Benchmarking EBS performance:
       o http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html
     Get started on Provisioned IOPS today! aws.amazon.com/ebs
     Questions: e-mail: shameemj@amazon.com
