Running MongoDB in the Cloud

A talk about how Wordnik migrated from EC2 to physical servers and back again, largely thanks to the cloud-friendliness of MongoDB.


Published in: Technology, Business

Transcript

  • 1. Running MongoDB in the Cloud Tony Tam @fehguy
  • 2. What this Talk is About
      • Wordnik left the cloud and came back
          • What?!?
          • Why we left
          • Decisions
          • Why we came back (and what we did differently)
  • 3. Who is Wordnik?
      • World’s fastest updating English dictionary
          • Based on input of text at ~8k words/second
          • Word Graph as basis to our analysis
          • Synchronous & asynchronous processing
      • 10’s of billions of documents in NR storage
      • Concept & Meaning Discovery Engine
      • > 20M daily REST API calls, billions served
  • 4. So Why the Detour?
      • Architectural Choices
      • Business Choices
      • Feedback, tooling, infrastructure
      • Learning
      • Changes in use case
      • Progress!
  • 5. Architecture History
      • EC2-based LAMP Stack
          • POC (and seed funding)
          • A manageable corpus < 1M records
      • REST API
          • Web + public
          • MySQL in master/slave
          • ~1B documents
          • Operational nightmare
  • 6. Architecture History
      • MongoDB
          • First-order MySQL issues solved
          • But it got slow…
      • Real Servers to the rescue!
          • Faster, bigger disks
      • MongoDB for Corpus, Structured Data
          • Faster Reads + Writes!
          • More metal (72GB RAM)
          • More cores
          • “cold” query from 400ms to < 100ms
  • 7. Why Change? Easy!
      • Can’t beat metal…except
          • Quick expansion
          • Batch jobs/experiments
          • Add a datacenter
          • Full cluster migration
          • The bill for unused capacity
  • 8. Architectural Mindshift
      1. Anything can die, anytime
      2. Centralized, redundant state (see point 1)
      3. Server performance is *different*
          • CPU, I/O, Memory: choose one
          • Smart design makes it work!
  • 9. Architectural Mindshift
      • Your software will need to change!
          • So will the components you rely on
  • 10. Your Infrastructure Cloud Hero
      • Deploying Servers
          • Going to need a lot!
      • Configuration
      • Updates to your software
    What about Data?
  • 11. Let’s make this Work!
      • MySQL Master Slave
          • Take a snapshot (yes, this will block)
          • Keep your binlogs!
        CHANGE MASTER TO MASTER_HOST='app1', MASTER_USER='XXXX', MASTER_PASSWORD='XXXX', MASTER_LOG_FILE='app1-relay.0038774', MASTER_LOG_POS=6754205951;
  • 12. Let’s make this Work! But…
      • Your master is down!
          • Quick, promote a slave!
          • Point the other slaves to the new master
      • As for the clients…
        “Well, we never really tried that…”
  • 13. Better with Mongo
      • Easy up, easy down!
          • Startup: Sync your data, and announce to clients when ready for business
          • Shutdown: Announce your departure and leave
      • Replica sets
        rs.add("db4.wordnik.com:27017");
        rs.remove("db1.wordnik.com:27017");
  • 14. Better with Mongo
  • 15. But what about Performance?
      • Software Design
          • It’s slow! (What is *it*?)
          • Profile everything
        import com.wordnik.util.perf._
        ...
        def findUser(id: Long): User = {
          Profile("UserDao::findUserById", dao.findUserById(id))
        }
        http://github.com/wordnik/wordnik-oss
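The "profile everything" idea on this slide can be sketched in a few lines. The version below is a minimal JavaScript illustration, not the wordnik-oss `Profile` implementation: the `profile` helper, the `stats` map, and the in-memory `users` lookup are all hypothetical names introduced for the example.

```javascript
// Hypothetical in-memory profiler; not the wordnik-oss implementation.
const stats = new Map(); // name -> { count, totalMs, maxMs }

// Runs fn, records how long it took under `name`, and returns fn's result.
function profile(name, fn) {
  const start = Date.now();
  try {
    return fn();
  } finally {
    const elapsed = Date.now() - start;
    const s = stats.get(name) || { count: 0, totalMs: 0, maxMs: 0 };
    s.count += 1;
    s.totalMs += elapsed;
    s.maxMs = Math.max(s.maxMs, elapsed);
    stats.set(name, s);
  }
}

// Stand-in for a DAO lookup (hypothetical data).
const users = new Map([[1, { id: 1, name: "tony" }]]);
function findUser(id) {
  return profile("UserDao::findUserById", () => users.get(id));
}

findUser(1);
findUser(2);
console.log(stats.get("UserDao::findUserById"));
```

Wrapping calls by name like this makes "it's slow" answerable per call site: the counters show which named operation accounts for the time.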
  • 16. But what about Performance?
  • 17. But what about Performance?
      • “It’s the database!”
          • What is it?
      • Mapping layer
          • MySQL (12+ joins) => 50 records/sec
          • Mongo JSON → POJO => 1000 records/sec
          • Mongo DBO → POJO => 35,000 records/sec
      • How do you know? Profile it!
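To make the mapping-layer comparison concrete, here is a rough JavaScript sketch of how records/sec for two mapping paths might be measured. It only mirrors the shape of the slide's comparison (mapping through a JSON string versus mapping fields directly); the record shape, the mapper functions, and `recordsPerSecond` are assumptions for illustration, and the numbers it prints will not match the Java figures on the slide.

```javascript
// Hypothetical record shape and mappers, for illustration only.
function toUserDirect(doc) {
  // Copy fields straight off the driver-level object.
  return { id: doc._id, name: doc.name };
}

function toUserViaJson(doc) {
  // Serialize to a JSON string and parse it back before mapping.
  const copy = JSON.parse(JSON.stringify(doc));
  return { id: copy._id, name: copy.name };
}

// Maps every doc once and returns the observed records per second.
function recordsPerSecond(mapper, docs) {
  const start = Date.now();
  for (const doc of docs) mapper(doc);
  const elapsedMs = Math.max(Date.now() - start, 1); // avoid dividing by zero
  return (docs.length * 1000) / elapsedMs;
}

const docs = Array.from({ length: 100000 }, (_, i) => ({ _id: i, name: "user" + i }));
console.log("direct  :", Math.round(recordsPerSecond(toUserDirect, docs)), "records/sec");
console.log("via JSON:", Math.round(recordsPerSecond(toUserViaJson, docs)), "records/sec");
```

Both mappers produce the same object, so any difference in the rates comes from the mapping path rather than from the data store.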
  • 18. It’s Still Slow!
      • It’s the index!
          • How do you know?
          • AHHHHH
  • 19. It’s Still Slow!
      • Balance your B-Tree
          • Can’t always keep the index in RAM. MMF (memory-mapped files) "does its thing"
          • Right-balanced B-tree keeps necessary index hot
          • If you hit indexes on disk, mute your pager
        [Figure: B-tree diagram]
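One common reading of the "right-balanced" advice is to choose index keys that increase over time, so new entries land at the right edge of the B-tree and the recently used part of the index stays in memory. The sketch below shows such a key scheme in JavaScript; `makeKey` and its padding widths are hypothetical, chosen only to make string order match time order.

```javascript
// Hypothetical key scheme: zero-padded millisecond timestamp, then a sequence number.
function makeKey(timestampMs, seq) {
  const ts = String(timestampMs).padStart(15, "0");
  const n = String(seq).padStart(6, "0");
  return ts + "_" + n;
}

// Keys in the order they would be created.
const keys = [
  makeKey(1300000000000, 0),
  makeKey(1300000000000, 1),
  makeKey(1300000005000, 0),
];

// Each new key sorts after the previous ones, so in a B-tree index it would be
// appended at the right edge instead of landing on an arbitrary page.
const sorted = [...keys].sort();
console.log(JSON.stringify(sorted) === JSON.stringify(keys));
```

By contrast, random keys (hashes, for example) spread inserts across the whole index, which is the case where index pages are more likely to be read from disk.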
  • 20. But it’s Still Slow!
      • Look at your Schema design
          • Design to limit index size/number
          • _id is your friend: make it meaningful
          • Record size consistency
          • Hierarchical Data beware!
          • Split documents even in same collection!
        db.posts.find({_id: /^tony_posts_/})
        {_id: "tony_posts_1", posts: [...]}
        {_id: "tony_posts_2", posts: [...]}
        {_id: "tony_posts_3", posts: [...]}
        YOUR app knows best
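The document-splitting pattern on this slide can be illustrated without a database. The JavaScript sketch below buckets one user's posts into several documents whose `_id` values follow the slide's `tony_posts_N` naming, then selects them by prefix the way the anchored regex query would. The `splitPosts` and `findByIdPrefix` helpers, the bucket size, and the second user `erin` are hypothetical.

```javascript
// Splits one user's posts into several documents with predictable _id values.
// The "<user>_posts_<n>" naming follows the slide; the helper is hypothetical.
function splitPosts(user, posts, bucketSize) {
  const docs = [];
  for (let i = 0; i < posts.length; i += bucketSize) {
    docs.push({
      _id: `${user}_posts_${docs.length + 1}`,
      posts: posts.slice(i, i + bucketSize),
    });
  }
  return docs;
}

// In-memory stand-in for db.posts.find({_id: /^tony_posts_/}).
function findByIdPrefix(collection, prefix) {
  return collection.filter((doc) => doc._id.startsWith(prefix));
}

const collection = [
  ...splitPosts("tony", ["p1", "p2", "p3", "p4", "p5"], 2),
  ...splitPosts("erin", ["q1"], 2),
];
console.log(findByIdPrefix(collection, "tony_posts_").map((d) => d._id));
```

Because the regex in the slide's query is anchored at the start of the string, MongoDB can generally serve it from the `_id` index rather than scanning the collection, which is what makes a meaningful `_id` prefix useful here.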
  • 21. Really, it’s STILL slow!
      • Your monolithic app/DB won’t scale the same on VMs
      • Specialize!
          • Wordnik uses SOA
          • Data tiers follow service types
          • Smaller *everything*
        [Badge: Powered API, swagger.wordnik.com]
  • 22. Really, it’s STILL slow!
      • Your monolithic app/DB won’t scale the same on VMs
      • Specialize!
          • Wordnik uses SOA
          • Data tiers follow service types
          • Smaller *everything*
        [Badge: Powered API, swagger.wordnik.com] A contract for your clients
  • 23. Be the Boss of your Data
      • Your app *should* be smarter than your DB
          • Lots of users?
          • Lots of blog posts?
          • Lots of images?
          • Shard? On what?
      • Data dimensionality
          • Keep active data hot
          • Don’t try to boil the ocean
  • 24. Cloud Computing + Mongo
      • It can work extremely well
          • No “Save as Cloud!” menu item
      • Shifting constraints
          • Optimize for RAM on VM
          • Virtual disk => virtual performance
      • Be “Deployable”
          • Mongo Replica Sets are made for this
  • 25. Cloud Computing + Mongo
      • System Durability
          • Design your software for abuse
          • Your old design doesn’t apply
          • Add APM hooks, now!
      • Dissect your app
          • Build to micro services with dedicated MongoDB clusters
      • Deployment Infrastructure
          • Don’t wait until it’s too late
  • 26. See More
      • See more about Wordnik APIs: http://developer.wordnik.com
      • Migrating from MySQL to MongoDB: http://www.slideshare.net/fehguy/migrating-from-mysql-to-mongodb-at-wordnik
      • Maintaining your MongoDB Installation: http://www.slideshare.net/fehguy/mongo-sv-tony-tam
      • Swagger API Framework: http://swagger.wordnik.com
      • Mapping Benchmark: https://github.com/fehguy/mongodb-benchmark-tools
      • Wordnik OSS Tools: https://github.com/wordnik/wordnik-oss
  • 27. Questions?
