Letters from the Trenches: Lessons Learned Taking MongoDB to Production

eHarmony moved one family of business-critical back-end applications to MongoDB several months ago. In this presentation, I discuss some of the important lessons we learned along the way about how to provision, scale, manage, and troubleshoot MongoDB.

  • Specifically, we’ll be talking about 5 lessons. It should take about 30 minutes.
  • At some point, you’ll realize that the data in your cluster isn’t what you need, or isn’t organized how you need it, and you’ll have to reconstruct it. In the first two cases, you could dump and reload a single cluster. But what about production changes in the meantime?
  • The idea is for the breakdown of data across shards to reflect the same natural divisions of data you’re likely to query against.

Presentation Transcript

  • Letters from the Trenches: Lessons Learned Taking MongoDB to Production
    October 17, 2013
    Rick Warren, rick.warren@eharmony.com
  • Traditional Internet Dating Service: Unidirectional User-Defined Criteria
  • eHarmony Matching: Bidirectional User-Defined Criteria
  • eHarmony Matching: 3 Parts
      1. Bidirectional User-Defined Criteria
      2. Research-Based Compatibility Models
      3. Machine-Learned Affinity Models
    Photo credits:
      Magnifying glass: andercismo @ http://www.flickr.com/photos/andercismo/
      Machine learning: University of Maryland Press Releases @ http://www.flickr.com/photos/umdnews/
  • Application: Find Potential Matches
    As fast as possible:
      1. Find people who meet each other’s preferences (part 1 of matching: Bidirectional User-Defined Criteria)
      2. Discard combos that violate Compatibility Models
  • Application: Find Potential Matches
      • User attributes in MongoDB
          – Replicated
          – Sharded
      • Data access pattern (for part 1, Bidirectional User-Defined Criteria):
          – Read-heavy
          – Complex queries
      • Java application
  • Application: Find Potential Matches
      • In full production for more than 6 months
          – Following several months of limited production
          – Following several months of intensive dev + testing
      • No production outages
      • MongoDB no longer the thing we worry about most
  • Lesson: Provision for Success
      ✓ Fit all data & indexes in memory
          – MongoDB storage implemented using mem-mapped files
          – Beware under-provisioned VMs
      ✓ Minimize field names to keep data as small as possible
          – “Schema-less records” == “schema repeated millions of times”
          – Morphia Java library can help with mapping
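The point about field names can be put into rough numbers. The sketch below is not from the talk; it assumes only that each BSON element stores its field name as a NUL-terminated string, so every name costs its length plus one byte in every document. The field names and the document count are hypothetical, chosen for illustration.

```python
# Rough estimate of how many bytes field names add to a collection.
# Assumption: each BSON element stores its field name as a NUL-terminated
# string, so a name costs len(name) + 1 bytes in every document.

def field_name_bytes(field_names):
    """Bytes spent on field names in one document (UTF-8 length + NUL)."""
    return sum(len(name.encode("utf-8")) + 1 for name in field_names)


def savings_bytes(long_names, short_names, doc_count):
    """Total bytes saved across doc_count documents by shortening names."""
    per_doc = field_name_bytes(long_names) - field_name_bytes(short_names)
    return per_doc * doc_count


# Hypothetical field names and document count, for illustration only.
long_names = ["minimumAge", "maximumAge", "maximumDistance"]
short_names = ["mnA", "mxA", "mxD"]

saved = savings_bytes(long_names, short_names, doc_count=10_000_000)
print(f"about {saved / 1e6:.0f} MB saved")  # about 260 MB saved
```

The estimate covers raw document size only; index sizes and padding are ignored, so treat it as a lower bound on why short names help data fit in memory.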
  • Lesson: Provision for Success
      ✓ Scale write ops & data volume by adding shards
      ✓ Scale read ops by adding secondaries
    [Diagram: Shard / RS, Shard / RS, …; each shard is a replica set with one Primary and several Secondaries, …]
  • Lesson: Be Ready to Tinker
      • Many processes:
          – mongod on each node, primary or secondary
          – 2 MMS agents
          – Plus, if sharding:
              • mongos for each app instance
              • 3 config servers
      • …Each configured separately & differently
          – Configuration file
          – Manual commands to set up
      • Less likely to have DBA support
          – …and relational Best Practices may not transfer
      ✓ Use Puppet, Chef, or similar
          – Helps with config files, command-line arguments
          – Insufficient for adding secondaries, configuring indexes, etc.
      ✓ If scripting, use real client driver, not mongo shell
          – Shell doesn’t handle output or errors consistently
          – Can’t wait in JavaScript
      ✓ Train your DB/Ops team(s)
          – And expect to do more yourself
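One thing a general-purpose language gives a setup script is the ability to wait for a condition, with a timeout and a clear success or failure result. The helper below is a minimal sketch, not the author's tooling. The `check` callable is a placeholder: in a real script it might ask a driver whether a newly added secondary has caught up, but that query is deliberately left out here.

```python
import time


def wait_until(check, timeout_s=60.0, interval_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout_s elapses.

    Returns True if the condition was met, False on timeout, so the calling
    script can fail loudly instead of continuing in an unknown state.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Illustration with a stand-in condition that becomes true on the third poll.
attempts = []


def node_is_ready():
    # Hypothetical check; a real one would query the cluster via a driver.
    attempts.append(1)
    return len(attempts) >= 3


if not wait_until(node_is_ready, timeout_s=5.0, interval_s=0.0):
    raise SystemExit("node did not become ready in time")
```

Passing `sleep` and `clock` as parameters is a design choice that keeps the helper easy to test without real delays.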
  • Lesson: Shadow Mode Is Your Friend
      ✓ Test with real production data, conditions, and queries
      ✓ Measure everything (MMS is a good start, but insufficient)
      ✓ Kill mongod instances to verify resiliency
    [Diagram: Real Events & Requests feed both the Real Application and the “Shadow” Application; the shadow application’s output is discarded (X)]
    Image credit: Primary school enrollment, Armenia: http://data.worldbank.org/country/armenia
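The shadow-mode diagram can be sketched as a small wrapper: every real request is also sent to the shadow application, whose result is thrown away and whose failures never reach the caller. This is an illustrative sketch under assumptions, not eHarmony's implementation; the handler and recorder names are hypothetical, and a production version would likely run the shadow call asynchronously rather than inline.

```python
import time


def make_shadowed_handler(real_handler, shadow_handler, record):
    """Wrap real_handler so each request is also replayed against shadow_handler.

    The caller only ever sees the real response. The shadow result is
    discarded; its outcome and duration are passed to record() for measurement.
    """
    def handle(request):
        response = real_handler(request)
        started = time.perf_counter()
        try:
            shadow_handler(request)
            outcome = "ok"
        except Exception as exc:  # shadow failures must not affect production
            outcome = f"error: {exc!r}"
        record({
            "request": request,
            "outcome": outcome,
            "seconds": time.perf_counter() - started,
        })
        return response
    return handle


# Illustration with stand-in handlers.
measurements = []
handler = make_shadowed_handler(
    real_handler=lambda request: f"real:{request}",
    shadow_handler=lambda request: f"shadow:{request}",
    record=measurements.append,
)
print(handler("find-matches-for-user-1"))
print(measurements[0]["outcome"])
```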
  • Lesson: Be Ready to Restore Your Data
      • Schemas will change
      • Shard key(s) will change
          – More on this later…
      • You’ll experience MongoDB bugs
      ✓ Maintain 2nd copy in another format
          – Backing source of truth?
          – Backup in standard format?
          – Second cluster with different version of MongoDB?
      ✓ Increment DB name with each reload
      ✓ Automate reload process, and use it
    Image credit: http://tutorialphotoshopcs-putradom.blogspot.com/2012/11/create-dramatic-meteor-and-burning-city.html
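Incrementing the database name on each reload is easy to automate. The sketch below assumes a naming scheme of the form `<base>_v<number>`, which is purely hypothetical; the talk does not say how the names were formed.

```python
import re


def next_db_name(current):
    """Return the database name to use for the next reload.

    Assumed (hypothetical) scheme: '<base>_v<number>'. A name without a
    version suffix is treated as the unversioned original and becomes '_v1'.
    """
    match = re.fullmatch(r"(.*?)_v(\d+)", current)
    if match is None:
        return f"{current}_v1"
    base, version = match.group(1), int(match.group(2))
    return f"{base}_v{version + 1}"


print(next_db_name("matching"))     # matching_v1
print(next_db_name("matching_v7"))  # matching_v8
```

Loading into a fresh name leaves the previous database untouched, so the application can be pointed back at it if the reload turns out to be bad.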
  • Lesson: Pick a Good Shard Key
      1. Distribute Data Volume Evenly
          – This is what auto-balancing does for you.
      2. Multiply Query Performance
          – Isolate queries to 1 shard to multiply read capacity by # of shards.
      3. Distribute Workload Evenly
          – Conflicts with above!
    [Diagram: mongos routing to Shard 1 and Shard 2]
    Image credits:
      Jessica Rabbit: http://disney.wikia.com/wiki/Jessica_Rabbit
      Steve Urkel: http://celebratingtvandfilmgeeks.wordpress.com/2010/04/25/steve-urkel-the-
  • Lesson: Pick a Good Shard Key
    DO These Things
      ✓ Use fields appearing in every query
      ✓ Choose combo that finely partitions data
      ✓ Measure relative load across shards
          – Consider adding secondaries to loaded shard(s) ONLY
    BEWARE These Things
      • Include serial numbers (or similar)
      • Hash fields when reads might be a problem
      • Mutable fields in shard key: remove and add
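A toy simulation can make the serial-number and hashing trade-off concrete and shows one way to measure relative load across shards. It is a simplified model, not MongoDB's actual chunk or hashing logic: range sharding is modeled with fixed boundaries, and hashing with MD5 modulo the shard count. All boundaries and key values are made up.

```python
import hashlib
from collections import Counter


def range_shard(key, boundaries):
    """Shard index under range partitioning; the last shard is open-ended."""
    for index, upper in enumerate(boundaries):
        if key < upper:
            return index
    return len(boundaries)


def hashed_shard(key, shard_count):
    """Shard index under a simplified hash partitioning (MD5 mod shard count)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def relative_load(assignments, shard_count):
    """Fraction of operations that landed on each shard."""
    counts = Counter(assignments)
    total = len(assignments)
    return [counts.get(shard, 0) / total for shard in range(shard_count)]


SHARDS = 4
boundaries = [1000, 2000, 3000]   # existing serial numbers are below 3000
new_serials = range(3000, 4000)   # new documents keep counting upward

by_range = relative_load([range_shard(k, boundaries) for k in new_serials], SHARDS)
by_hash = relative_load([hashed_shard(k, SHARDS) for k in new_serials], SHARDS)

print("range-partitioned serials:", by_range)  # every new write hits the last shard
print("hashed serials:           ", by_hash)
```

In this model, hashing spreads the writes, but a query for a contiguous range of serial numbers would then touch every shard, which is one plausible reading of the warning about hashing when reads might be a problem.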
  • Summary
      1. Provision for Success
      2. Be Ready to Tinker
      3. Shadow Mode Is Your Friend
      4. Be Ready to Restore Your Data
      5. Pick a Good Shard Key
  • We’re Hiring http://www.eharmony.com/about/careers rick.warren@eharmony.com