MongoDB and Amazon Web Services: Storage Options for MongoDB Deployments

When running MongoDB on AWS, you want to design your infrastructure to avoid storage bottlenecks and make the best use of your available storage resources. AWS offers many storage options, including ephemeral disks, EBS, Provisioned IOPS, and ephemeral SSDs, each with different performance and persistence characteristics. In this session, we’ll evaluate each of these options in the context of your MongoDB deployment, assessing the benefits and drawbacks of each.

Published in: Technology


  1. 1. MongoDB and AWS Storage Configurations • Sandeep Parikh, Senior Solutions Architect, MongoDB Inc. • #mongodb
  2. 2. Quick Recap • Deployment and Availability – MongoDB Basics – Deployment Configurations – Instance Types – Best Practices • Slides and Recording: – http://www.mongodb.com/presentations/mongodb-and-amazon-web-services-deploying-high-availability
  3. 3. Agenda • Storage Options • Simple Recommendations • Backup and Restore • Advanced Configurations • Drawbacks/Tradeoffs • Next Steps
  4. 4. Storage Options
  5. 5. AWS Storage Options • Instance-based (ephemeral) • Elastic Block Store (persistent) • Simple Storage Service (S3) • Glacier
  6. 6. MongoDB Storage Elements • Data • Journal • Logs • Snapshots • Archived Backups
  7. 7. MongoDB Elements & AWS Storage (Data Lifecycle) • Instance – Data – Log – Journal • EBS – Data – Log – Journal – Snapshots • S3 – Snapshots – Archived Backups • Glacier – Archived Backups
  8. 8. Instance Storage • Ephemeral – If your instance is stopped or terminated, ephemeral storage is lost (!) • Configurations – Single or multiple volumes per instance • Management – LVM for RAID or snapshots
  9. 9. EBS • Persistent – Allocated and attached to individual instances like network-attached storage – Storage lifecycle independent of instances • Configuration – Single or multiple volumes per instance • Management – LVM or MD for RAID – EBS Snapshots (Console or API)
  10. 10. Standard EBS Standard volumes are designed for applications with moderate I/O requirements. They are also well-suited for use as boot volumes or applications where I/O can be bursty. • Performance is somewhat variable • Average of 100 IOPS • Possible to aggregate via RAID but underlying bursty nature still exists
  11. 11. Provisioned IOPS EBS Provisioned IOPS volumes offer storage with consistent and low-latency performance, and are designed for applications with I/O-intensive workloads such as databases. • Consistent volume I/O performance • Available with 100-4000 IOPS per volume • Launch with EBS-Optimized – Adds additional network bandwidth for EBS volumes
  12. 12. Measuring IOPS • Volumes are optimized for 4 KB per operation • MongoDB document sizes and workload patterns will affect throughput • Use mongoperf to test disk configuration – Threads – Data file size – Document size
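The point about document size affecting throughput can be illustrated with a back-of-envelope calculation. The sketch below assumes, purely for illustration, that each document write costs a whole number of 4 KB I/O units; it ignores journaling, index updates, and caching, so it is no substitute for a real mongoperf run. The function names are hypothetical.

```python
import math

IO_UNIT_KB = 4  # per-operation size the slide says volumes are optimized for


def io_ops_per_write(doc_size_kb: float, io_unit_kb: float = IO_UNIT_KB) -> int:
    """Rough number of I/O operations needed to write one document.

    Assumption: a write consumes whole I/O units, and at least one.
    """
    return max(1, math.ceil(doc_size_kb / io_unit_kb))


def max_doc_writes_per_second(volume_iops: int, doc_size_kb: float) -> float:
    """Upper bound on document writes per second under the assumption above."""
    return volume_iops / io_ops_per_write(doc_size_kb)


if __name__ == "__main__":
    # Compare a few document sizes against a 1000 IOPS volume.
    for size_kb in (1, 4, 16, 64):
        print(size_kb, "KB ->", max_doc_writes_per_second(1000, size_kb), "writes/s")
```

Under this simplified model, larger documents consume the provisioned IOPS budget faster, which is why the slide recommends testing with realistic document sizes.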
  13. 13. Simple Recommendations
  14. 14. Multiple EBS Volumes • Provisioned IOPS EBS • EBS-optimized • Separate volumes for – Data – Journal – Log • Decrease disk contention during high load
  15. 15. Disk Configurations • Mirror or stripe multiple disks (or both) – LVM – MDADM • Different implications for each RAID level – Durability – Performance – Cost
  16. 16. Aggregating IOPS • Single volumes capable of 4000 IOPS • Stripe volumes to aggregate IOPS (RAID0, RAID10) • Note: network bandwidth is the limiting factor
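The interaction between striping and the network limit can be sketched as a simple minimum. This is a rough model assuming RAID0-style striping where IOPS add up linearly; the bandwidth figure in the usage example is a hypothetical value, not a quoted AWS limit, and the function names are illustrative.

```python
def striped_iops(volume_count: int, iops_per_volume: int) -> int:
    """Nominal aggregate IOPS of a stripe set (assumes linear scaling)."""
    return volume_count * iops_per_volume


def network_iops_ceiling(bandwidth_mbit_per_s: float, io_size_kb: float = 4) -> int:
    """IOPS that fit through the instance's EBS network link at a given I/O size."""
    bytes_per_second = bandwidth_mbit_per_s * 1_000_000 / 8
    return int(bytes_per_second // (io_size_kb * 1024))


def effective_iops(volume_count: int, iops_per_volume: int,
                   bandwidth_mbit_per_s: float, io_size_kb: float = 4) -> int:
    """Whichever is lower wins: the stripe set or the network link."""
    return min(striped_iops(volume_count, iops_per_volume),
               network_iops_ceiling(bandwidth_mbit_per_s, io_size_kb))


if __name__ == "__main__":
    # Four 4000 IOPS volumes behind a hypothetical 500 Mbit/s link.
    print(effective_iops(4, 4000, bandwidth_mbit_per_s=500))
```

In this example the stripe set could nominally deliver 16000 IOPS, but the assumed link only carries about 15258 operations of 4 KB per second, so the network is the binding constraint.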
  17. 17. MongoDB on AWS Marketplace
  18. 18. MongoDB on AWS Marketplace
  19. 19. MongoDB Configurations • Follows MongoDB best practices – Amazon Linux, MongoDB installed via yum – EBS PIOPS volumes per mount (data, log, journal) – Configured: ulimits, read ahead, keep alive
      Config    | Data size | Data IOPS | Log size | Log IOPS | Journal size | Journal IOPS
      1000 IOPS | 200 GB    | 1000      | 10 GB    | 100      | 25 GB        | 250
      2000 IOPS | 200 GB    | 2000      | 15 GB    | 150      | 25 GB        | 250
      4000 IOPS | 400 GB    | 4000      | 20 GB    | 200      | 25 GB        | 250
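For capacity planning it can help to see the per-instance totals implied by the three Marketplace configurations. The sketch below simply encodes the sizes and IOPS from the slide as data and sums them; the dictionary layout and function name are illustrative choices, not part of any AWS or MongoDB API.

```python
# (size in GB, provisioned IOPS) per volume, taken from the slide's table.
CONFIGS = {
    "1000 IOPS": {"data": (200, 1000), "log": (10, 100), "journal": (25, 250)},
    "2000 IOPS": {"data": (200, 2000), "log": (15, 150), "journal": (25, 250)},
    "4000 IOPS": {"data": (400, 4000), "log": (20, 200), "journal": (25, 250)},
}


def totals(config_name: str) -> tuple[int, int]:
    """Return (total GB, total provisioned IOPS) across data, log, and journal."""
    volumes = CONFIGS[config_name].values()
    total_gb = sum(size for size, _ in volumes)
    total_iops = sum(iops for _, iops in volumes)
    return total_gb, total_iops


if __name__ == "__main__":
    for name in CONFIGS:
        gb, iops = totals(name)
        print(f"{name}: {gb} GB provisioned, {iops} IOPS provisioned")
```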
  20. 20. Backup and Restore
  21. 21. Data Safety • What’s your backup plan? • Have you tested restoring? • Is your data highly available? • How do you recover from disaster?
  22. 22. Protecting Your Data • Replica Sets – Proper deployments provide HA and DR • Manual backup/restore – Scriptable, tunable • MMS Backup – Continuous, secure backup
  23. 23. Manual Backup Procedures • EBS – EBS Snapshots – LVM Snapshots • Ephemeral – LVM Snapshots • Note: – EBS snapshots can be taken “hot”, but for MongoDB it’s better to call fsyncLock() first – LVM snapshots require enough free space on the instance to store the snapshot
  24. 24. Restore • Boot a new instance or use an existing one • Create a new volume from the EBS snapshot and attach it, or • Copy over the LVM snapshot and create/mount the LV
  25. 25. Archiving Backups • LVM – Copy snapshots to S3 bucket – Create lifecycle rules to move data from bucket to Glacier • EBS – Mount volume from snapshot – Copy volume data to S3 bucket – Create lifecycle rules to move data from bucket to Glacier
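A scripted version of the fsyncLock-then-snapshot procedure might look like the sketch below. It assumes a pymongo `MongoClient` and a boto3 EC2 client are passed in, and that the server supports the `fsyncUnlock` command (older MongoDB releases unlock differently). The function name and the default description string are hypothetical.

```python
def snapshot_with_fsync_lock(mongo_client, ec2_client, volume_ids,
                             description="mongodb backup"):
    """Lock writes, start an EBS snapshot of each volume, then unlock.

    mongo_client: assumed to be a pymongo.MongoClient connected to the node.
    ec2_client:   assumed to be a boto3 EC2 client (boto3.client("ec2")).
    Returns the list of snapshot IDs that were started.
    """
    snapshot_ids = []
    # Flush pending writes to disk and block new ones.
    mongo_client.admin.command("fsync", lock=True)
    try:
        for volume_id in volume_ids:
            response = ec2_client.create_snapshot(
                VolumeId=volume_id, Description=description)
            snapshot_ids.append(response["SnapshotId"])
    finally:
        # Always release the lock, even if a snapshot call fails.
        mongo_client.admin.command("fsyncUnlock")
    return snapshot_ids
```

The `try`/`finally` is the main design choice here: a failed snapshot call should never leave the node locked against writes. A caller would typically run this against a Secondary, for example `snapshot_with_fsync_lock(MongoClient(host), boto3.client("ec2"), ["vol-..."])`, where the host and volume IDs are placeholders.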
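The EBS branch of the restore procedure can be sketched with boto3 as follows. This is a minimal outline assuming a boto3 EC2 client is passed in; the function name is hypothetical, and the device name, instance ID, and availability zone are supplied by the caller. The optional `iops` argument is an illustrative way to recreate the volume as Provisioned IOPS with a different capacity.

```python
def restore_volume_from_snapshot(ec2_client, snapshot_id, availability_zone,
                                 instance_id, device, iops=None):
    """Create a volume from an EBS snapshot and attach it to an instance.

    ec2_client: assumed to be a boto3 EC2 client (boto3.client("ec2")).
    Returns the ID of the newly created volume.
    """
    params = {"SnapshotId": snapshot_id, "AvailabilityZone": availability_zone}
    if iops is not None:
        # Request a Provisioned IOPS volume with the given IOPS.
        params.update(VolumeType="io1", Iops=iops)
    volume_id = ec2_client.create_volume(**params)["VolumeId"]

    # Wait until the volume is ready before attaching it.
    ec2_client.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2_client.attach_volume(
        VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return volume_id
```

After the attach call returns, the volume still has to be mounted inside the instance before mongod can use it; that step is outside this sketch.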
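The lifecycle rule that moves archived backups from S3 to Glacier can be expressed as a small configuration document. The sketch below builds that document in the shape boto3's `put_bucket_lifecycle_configuration` expects; the function name, rule ID, key prefix, and day count are hypothetical values chosen for illustration.

```python
def glacier_lifecycle_configuration(prefix, days_until_glacier,
                                    rule_id="archive-mongodb-backups"):
    """Build an S3 lifecycle configuration that transitions objects to Glacier.

    prefix:             key prefix the rule applies to (e.g. a backups folder).
    days_until_glacier: object age in days at which the transition happens.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_until_glacier, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Example: archive everything under backups/ after 30 days.
    print(glacier_lifecycle_configuration("backups/", 30))
```

With a boto3 S3 client, the result would be applied with `s3.put_bucket_lifecycle_configuration(Bucket=bucket_name, LifecycleConfiguration=config)`, where `bucket_name` is the bucket holding the copied snapshots.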
  26. 26. MongoDB Management Service
  27. 27. MMS Backup
  28. 28. MMS Backup In-Depth • Fully-managed, agent-based, continuous backup • Custom snapshot scheduling and retention • Point-in-time recovery and consistent snapshots across sharded clusters • Performance impact similar to a Secondary • Encrypted data transfer • Restores require 2-factor authentication
  29. 29. Advanced Configurations
  30. 30. Standard Ephemeral Storage • Remember, it’s ephemeral • Technically feasible • Lack of persistence is a big negative • Any benefits can’t outweigh the negatives
  31. 31. Ephemeral SSDs • Performance ceiling might outweigh typical negatives • Cost implications: SSD-backed instances are more expensive • Does your workload truly need flash? – Profile early and often to make this determination • How many drives do you need? – Drives instance choice
  32. 32. RAID SSD and MongoDB Configurations [Diagram: layouts of SSD drives and mongod processes]
  33. 33. SSD Deployment Strategies • SSD deployments – Replica Sets and – MMS Backup • High performance • Highly available • Continuous backup [Diagram: mongod Primary, two mongod Secondaries, MMS Backup Agent]
  34. 34. SSD Deployment Considerations • One Secondary could use EBS • Will need to have an instance with – High network bandwidth and – Multiple EBS volumes aggregated to approach IOPS parity • Key is avoiding significant replication lag because of IO performance dropoff
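One way to watch for the replication lag mentioned above is to compare each Secondary's last applied operation time with the Primary's. The sketch below works on the document returned by the `replSetGetStatus` command and assumes each member entry carries `name`, `stateStr`, and `optimeDate` fields; the function name and host names are hypothetical.

```python
from datetime import datetime


def replication_lag_seconds(repl_status):
    """Return {member name: lag in seconds} for every SECONDARY.

    repl_status: the result of the replSetGetStatus command, e.g. from
    pymongo via client.admin.command("replSetGetStatus").
    """
    members = repl_status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


if __name__ == "__main__":
    # Hand-written sample status: one SSD-backed and one EBS-backed Secondary.
    sample = {"members": [
        {"name": "ssd-a:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2014, 1, 1, 12, 0, 30)},
        {"name": "ssd-b:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2014, 1, 1, 12, 0, 29)},
        {"name": "ebs-c:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2014, 1, 1, 12, 0, 5)},
    ]}
    print(replication_lag_seconds(sample))
```

A steadily growing lag on the EBS-backed member would suggest its aggregated volumes are not keeping up with the SSD-backed Primary.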
  35. 35. Drawbacks & Tradeoffs
  36. 36. Considerations • Performance • Consistency • Safety • Flexibility • Scalability
  37. 37. Best Practices • Prototype > Test > Scale • IO on AWS is easy to scale • AWS makes it easy to iterate deployment – Start small – Profile your workload – Remove all other bottlenecks – Add instance and IO capacity
  38. 38. Recommended Starting Points • EBS-Optimized and PIOPS EBS • M1.large is an effective starting point for profiling an early production deployment • Use volumes with 250 or 500 IOPS for data to start – Adding more IOPS is easy – Snapshot and recreate with more capacity
  39. 39. Questions?
  40. 40. Resources • MMS Monitoring and Backup – http://mms.mongodb.com • MongoDB on AWS best practices: – http://bit.ly/deploy-mongodb-ec2 • MongoDB on AWS Marketplace: – http://bit.ly/aws-marketplace-mongodb • MongoDB docs – http://docs.mongodb.org
  41. 41. MongoDB World New York City, June 23-25 #MongoDBWorld See what’s next in MongoDB including • MongoDB 2.6 • Sharding • Replication • Aggregation http://world.mongodb.com Save 25% with discount code 25SandeepParikh
