Scaling MongoDB on Amazon Web Services (DAT209) | AWS re:Invent 2013


Over the past year, mobile in-app feedback provider Apptentive has scaled MongoDB on AWS from a single machine to a sharded, several-hundred-gigabyte cluster handling thousands of operations per second. This session, packed with demos, code, and actual performance numbers, shares the lessons learned along the way. Topics include picking the right tools for the job (instance sizing and selection, I/O choices, and topological choices); using Chef/AWS OpsWorks and AWS CloudFormation to deploy and scale; monitoring with Amazon CloudWatch and MMS; managing backups with Amazon EBS snapshots; and using Amazon Elastic MapReduce alongside MongoDB instances.

Published in: Technology, Design

  1. 1. DAT209 - Scaling MongoDB on Amazon Web Services Michael Saffitz, CTO & Co-Founder, Apptentive November 15, 2013 © 2013, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of, Inc.
  2. 2. Nice to Meet You! Mike Saffitz CTO, Co-Founder, Apptentive Follow at: @msaffitz • Connect at: Apptentive The easiest way for anyone with an app to talk with their customers Follow at: @apptentive • Connect at:
  3. 3. Apptentive & AWS
  4. 4. Apptentive & AWS Route53 CloudFront IAM S3 Web Servers EC2: 6 x c1.medium (Elastic Load Balancer) Elastic Beanstalk, RDS CloudWatch Elastic MapReduce VPN Server EC2: m1.small Stats & Logging EC2: 2x m1.medium m1.small Sharded MongoDB Cluster EC2: 9 Instances CI & Chef EC2: m1.medium m1.small Redis EC2: m1.medium Virtual Private Cloud
  5. 5. Agenda • Why Scale MongoDB on AWS? • Planning • Deploying • Maintaining
  6. 6. Why Scale MongoDB on AWS?
  7. 7. Why Scale MongoDB on AWS? Supports Diverse Set of Scenarios Rapidly Scale On Demand Simple To Administer Easy Friendly Query Syntax Well Documented Flexible Broad Language Support Competitive TCO Cost Effective Fine Grain Control Over Price & Performance
  8. 8. Why Not Scale MongoDB on AWS? Your Data is Predominately Relational in Nature Don’t Want to Incur the Administrative Costs Consider RDS Hosted Alternatives Consider DynamoDB
  9. 9. 1. Planning
  10. 10. Planning Checklist • Topologies – MongoDB – AWS • Instance Selection • Storage
  11. 11. MongoDB Topologies: Single Server mongod
  12. 12. MongoDB Topologies: Single ReplicaSet w/ Arbiter Automatic Failover mongod (primary) mongod (secondary) Contains Full Copy of the Data on the Primary – Can Be Used for Reads mongod (arbiter) Arbiter Only Participates in Voting to Elect a New Primary (Must Have Odd # of Voting Members)
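The arbiter topology on this slide can be sketched as a replica set configuration document. This is a minimal illustration, not the presenter's configuration: the set name `rs0` and the `*.example.internal` hostnames are hypothetical, and the helper functions are invented for the example. The `arbiterOnly` member field is the standard MongoDB setting that marks a member as an arbiter.

```python
def replica_set_config(name, data_hosts, arbiter_host=None):
    """Build a replica set configuration document (the shape passed to rs.initiate)."""
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    if arbiter_host is not None:
        # An arbiter holds no data; it only votes in elections.
        members.append(
            {"_id": len(members), "host": arbiter_host, "arbiterOnly": True}
        )
    return {"_id": name, "members": members}


def has_odd_voters(config):
    """Every member votes by default, so count all members."""
    return len(config["members"]) % 2 == 1


# Hypothetical hosts: two data-bearing members plus one arbiter.
config = replica_set_config(
    "rs0",
    ["db1.example.internal:27017", "db2.example.internal:27017"],
    arbiter_host="arb1.example.internal:27017",
)
print(has_odd_voters(config))
```

With only the two data-bearing members the voter count would be even; adding the arbiter brings it to three, which is what the "Must Have Odd #" note is about.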
  13. 13. MongoDB Topologies: Single ReplicaSet Automatic Failover mongod (primary) mongod (secondary) Scale Across Instance Types mongod (secondary) Data Replicated Within ReplicaSet
  14. 14. MongoDB Topologies: Sharded Cluster App Server mongos App Server … mongos mongod process config config config Data Partitioned Across Shards mongod (primary) mongod (secondary) mongod (secondary) Data Replicated Within Shard … mongod (primary) mongod (secondary) mongod (secondary)
  15. 15. MongoDB Topologies: Picking One • Single Server? Not For Production • Don’t Shard Prematurely – ReplicaSets can take you surprisingly far • … But Don’t Wait Too Long to Shard – Collections over 256GB may have issues migrating to shards – Rebalancing consumes IO and can be very slow • Pick the Right Instance Size for Your Topology… – We’re going to get to this in a moment
  16. 16. AWS Topologies: AZs & Regions • Obvious: Distribute Across Availability Zones in a Region – No Single Point of Failure • Distributing Across Regions – Shard per Region versus Shards Across Regions – Considerations: Replication Latency • Data Transfer Costs • Administration Costs • Speedup from Geo-Based Tag Aware Sharding
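As a rough illustration of "Distribute Across Availability Zones", the sketch below assigns replica set members to zones round-robin so that no single zone holds every copy of the data. The member labels and zone names are example values only, and `assign_zones` is a hypothetical helper, not part of any AWS or MongoDB tool.

```python
from itertools import cycle


def assign_zones(members, zones):
    """Assign each member to a zone in round-robin order."""
    zone_cycle = cycle(zones)
    return {member: next(zone_cycle) for member in members}


# Example values: three members spread over three zones in one region.
placement = assign_zones(
    ["primary", "secondary-1", "secondary-2"],
    ["us-east-1a", "us-east-1b", "us-east-1c"],
)
for member, zone in placement.items():
    print(member, zone)
```

With at least as many zones as members, each member lands in its own zone; with fewer zones, members wrap around and some zones hold more than one.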
  17. 17. Selecting an Instance: Considerations Compute Memory EBS Optimized? Cost
  18. 18. Selecting an Instance: Compute • Most Likely Not a Significant Factor – Exceptions: Heavy use of Map/Reduce, Aggregation Framework – MongoDB 2.4 added concurrency via V8 – Important! Only run 64-Bit; 32-Bit is limited to ~2GB • Real World Numbers on m1.large:
  19. 19. Selecting an Instance: Memory • Estimate Necessary Working Set – db.runCommand( { serverStatus: 1, workingSet: 1 } ) Is pagesInMemory * 4k approaching total RAM? Is overSeconds decreasing / small? – db.stats() • Pick the Instance that Matches • Monitor on MMS – Page Faults (abstract) – Queues (better) – Response Times (best)
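The "pagesInMemory * 4k" check can be worked through as a small calculation. This sketch assumes a `workingSet` sub-document shaped like the one returned by the `serverStatus` command on the slide; the sample numbers and the 7.5 GiB RAM figure are made up for illustration.

```python
PAGE_SIZE_BYTES = 4 * 1024  # the slide's "4k" page size


def working_set_bytes(working_set):
    """Estimate working set size from the workingSet sub-document."""
    return working_set["pagesInMemory"] * PAGE_SIZE_BYTES


def ram_fraction(working_set, total_ram_bytes):
    """Fraction of total RAM that the estimated working set occupies."""
    return working_set_bytes(working_set) / total_ram_bytes


# Made-up sample values, not measurements from the talk.
sample = {"pagesInMemory": 1_500_000, "overSeconds": 900}
total_ram = 7.5 * 1024**3  # hypothetical instance with 7.5 GiB of RAM

fraction = ram_fraction(sample, total_ram)
print(f"working set uses {fraction:.0%} of RAM")
```

A fraction approaching 1.0, together with a small or shrinking `overSeconds`, is the signal the slide describes for moving to an instance with more memory.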
  20. 20. Selecting an Instance: EBS Optimization • Run EBS Optimized When Available – Especially with Provisioned IOPS • Volume Config Impacts IO Perf Far More than Instance Selection
  21. 21. Storage • Instance Storage – Non-Durable – Fast But Inconsistent Performance – Can’t Use Snapshots for Backups • “Standard” EBS – Slower – Higher Performance Variability • Provisioned IOPS EBS – Consistent Performance – Don’t Under-Provision; Watch Queue Length
  22. 22. Storage • RAID 10? Just use LVM on RAID 0 – More: • Use XFS or Ext4 • Mount with noatime, noexec, nodiratime
  23. 23. Selecting an Instance: Summary 1. Lead with Working Set Requirements 2. Validate Compute is Sufficient 3. Enable EBS Optimized if Available 4. Use Provisioned IOPS EBS 5. (Confirm Cost is Acceptable)
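Step 1 of the summary, leading with working set requirements, might look like the sketch below: pick the smallest instance whose memory covers the working set plus some headroom. The memory figures are for instance types of that era and should be verified against current AWS documentation; the 25% headroom factor is an assumption, not a recommendation from the talk.

```python
# Memory in GiB for a few 2013-era instance types (verify before relying on these).
INSTANCE_MEMORY_GIB = {
    "m1.large": 7.5,
    "m1.xlarge": 15.0,
    "m2.xlarge": 17.1,
    "m2.2xlarge": 34.2,
    "m2.4xlarge": 68.4,
}


def pick_instance(working_set_gib, headroom=1.25):
    """Return the smallest instance type that fits the working set, or None."""
    needed = working_set_gib * headroom  # headroom factor is an assumption
    candidates = [
        (memory, name)
        for name, memory in INSTANCE_MEMORY_GIB.items()
        if memory >= needed
    ]
    if not candidates:
        return None  # no single instance fits: a hint to shard
    return min(candidates)[1]


print(pick_instance(10))
print(pick_instance(60))
```

The remaining steps (compute, EBS optimization, Provisioned IOPS, cost) are validation passes over the candidate this produces rather than inputs to it.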
  24. 24. 2. Deploying
  25. 25. It’s Easy. Let me show you.
  26. 26. Scaling Deployment • DevOps: Go for ‘bilities: – Reliability, Predictability, Repeatability, and Auditability • The Result is Easy Replaceability and Scalability – Build your infrastructure so it can be treated like an appliance – The impact of your decisions during planning will be significantly mitigated
  27. 27. DevOps Tools • AWS Marketplace AMIs – Preconfigured with MongoDB best practices – Do-it-yourself scaling to ReplicaSets / Shards – Helpful, but not a DevOps Solution • AWS CloudFormation – Templates for Resource Setup & Initial Configuration • Chef, Puppet, Ansible, SaltStack, & More – AWS OpsWorks, but limited by chef-solo
  28. 28. Security • Run in a VPC – Complications: Cross Region, Multiple Source Ingress • Use KeyFiles & Roles – KeyFiles: Internal authentication for cluster members – Roles allow for user-level fine-grained access control • Advanced: – Kerberos support in MongoDB 2.4 – SSL Support in Custom Builds & MongoDB Enterprise
  29. 29. 3. Maintaining
  30. 30. Monitoring: MongoDB Monitoring Service • Very Good, Free Holistic Monitoring – Important: ReplLag, Page Faults, Lock % – Informative: OpCounters, Connections, Queue Lengths • Includes Basic Alerting of Host Failures and Metric Thresholds • Query Profiler Details Slow Queries – db.setProfilingLevel(1)
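Once `db.setProfilingLevel(1)` is enabled, slow operations are recorded in the `system.profile` collection with fields such as `op`, `ns`, and `millis`. The sketch below shows one way to rank namespaces by total slow time. It works on plain dictionaries rather than a live connection; the sample documents and the 100 ms threshold are illustrative assumptions.

```python
from collections import defaultdict


def slowest_namespaces(profile_docs, threshold_ms=100):
    """Total the time of slow operations per namespace, slowest first."""
    totals = defaultdict(int)
    for doc in profile_docs:
        if doc["millis"] >= threshold_ms:
            totals[doc["ns"]] += doc["millis"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Illustrative documents shaped like system.profile entries.
sample_docs = [
    {"op": "query", "ns": "app.messages", "millis": 250},
    {"op": "update", "ns": "app.devices", "millis": 120},
    {"op": "query", "ns": "app.messages", "millis": 40},
    {"op": "query", "ns": "app.messages", "millis": 180},
]

ranking = slowest_namespaces(sample_docs)
print(ranking)
```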
  31. 31. Monitoring: Amazon CloudWatch • Detailed Resource Level Monitoring – Important: Queue Length, Read/Write Latencies • Versatile alerting based on Amazon Simple Notification Service (SNS)
  32. 32. Backups • Delayed Secondary – Questionable as a primary backup strategy • Dump/Restore – Impractical for larger deployments • MongoDB Service – Managed, Secure, Point in Time. Unclear suitability for larger deployments – Expensive • Snapshots – Fast, Easy, Scalable. Pay Attention to Consistency (RAID, Shards)
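The "Pay Attention to Consistency" point for RAID volumes can be sketched as an ordering constraint: block writes, start a snapshot of every volume in the set, then release the lock even if a snapshot call fails. The callables are injected so the sketch stays self-contained. In a real deployment they would presumably wrap `db.fsyncLock()` / `db.fsyncUnlock()` and the EC2 snapshot API, but that wiring, the volume IDs, and the function names here are all assumptions, not the speaker's tooling.

```python
from contextlib import contextmanager


@contextmanager
def writes_locked(lock, unlock):
    """Hold the write lock for the duration of the block; always release it."""
    lock()
    try:
        yield
    finally:
        unlock()


def snapshot_volumes(volume_ids, create_snapshot, lock, unlock):
    """Start a snapshot of every volume while writes are blocked."""
    with writes_locked(lock, unlock):
        return [create_snapshot(volume_id) for volume_id in volume_ids]


# Fake callables that record the order of operations.
events = []


def fake_snapshot(volume_id):
    events.append(("snapshot", volume_id))
    return f"snap-for-{volume_id}"


snapshot_ids = snapshot_volumes(
    ["vol-a", "vol-b"],  # hypothetical IDs for the volumes in one RAID set
    create_snapshot=fake_snapshot,
    lock=lambda: events.append(("lock",)),
    unlock=lambda: events.append(("unlock",)),
)
print(events)
```

For sharded clusters the same ordering concern applies across shards and config servers, which is the second half of the slide's "(RAID, Shards)" caveat.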
  33. 33. Easy Snapshot-Based Backups With Mongolly • Automatic topology detection, snapshotting, and snapshot management for EBS-backed MongoDB Databases • Easy as: $ mongolly backup
  34. 34. Conclusions • MongoDB + AWS = • Options For All Deployment / Workload Sizes – I/O typically the focal point for optimization • Investing in a DevOps Strategy + Solution Makes It Near Effortless
  35. 35. Please give us your feedback on this presentation DAT209 As a thank you, we will select prize winners daily for completed surveys!