A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?

Transcript

  • 1. Why Wordnik went Non-Relational (Tony Tam, @fehguy)
  • 2. What this Talk is About
    • 5 key reasons why Wordnik migrated to a non-relational database
    • The process for selection and migration
    • Optimizations and tips from living survivors of the battlefield
  • 3. Why Should You Care?
    • MongoDB user for 2 years
    • Lessons learned, analysis, and benefits from the process
    • We migrated from MySQL to MongoDB with no downtime
    • We have interesting/challenging data needs, likely relevant to you
  • 4. More on Wordnik
    • World's fastest-updating English dictionary
      • Based on input of text at up to 8k words/second
      • Word Graph as the basis of our analysis
      • Synchronous & asynchronous processing
    • 10s of billions of documents in non-relational storage
    • 20M daily REST API calls, billions served
      • Powered by the Swagger OSS API framework: swagger.wordnik.com
  • 5. Architectural History
    • 2008: Wordnik was born as a LAMP stack on AWS EC2
    • 2009: Introduced the public REST API; powered wordnik.com and partner APIs
    • 2009: Drank the NoSQL Kool-Aid
    • 2010: Scala
    • 2011: Micro SOA
  • 6. Non-relational by Necessity
    • Moved to NR because of the "4 S's":
      • Speed
      • Stability
      • Scaling
      • Simplicity
    • But...
      • MySQL can go a LONG way
      • It takes the right team and the right reasons (+ patience)
      • NR offerings were simply too compelling to focus on scaling MySQL
  • 7. Wordnik’s 5 Whys for NoSQL
  • 8. Why #1: Speed Bumps with MySQL
    • Inserting data fast (50k records/second) caused MySQL mayhem
      • Maintaining indexes was largely to blame
      • Consistency operations were unnecessary for us but "cannot be turned off"
    • Devised twisted schemes to avoid client blocking
      • A.k.a. the "master/slave tango"
  • 9. Why #2: Retrieval Complexity
    • Objects typically map to tables
      • An object hierarchy always means inner + outer joins
    • Lots of static data, so why join?
      • "Noun" is not getting renamed in my code's lifetime!
      • Logic like this probably lives in application logic anyway
    • Since storage is cheap, I'll choose speed
  • 10. Why #2: Retrieval Complexity (diagram: one definition = 10+ joins, at 50 requests per second!)
  • 11. Why #2: Retrieval Complexity
    • Embedding objects in rows "sort of works"
      • Filtering gets really nasty
      • Native XML in MySQL? Fine, if a full table scan is OK...
    • OK, then cache it!
      • Layers of caching introduced layers of complexity
      • Stale data/corruption
      • Object versionitis
      • Cache stampedes
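
To make the embedding argument concrete, here is a hedged sketch (in Scala with the Casbah driver; all field names are invented, not Wordnik's actual schema) of the denormalized shape: the 10+ joins from slide 10 collapse into a single document read.

    import com.mongodb.casbah.Imports._

    // One definition as a single document: the part of speech, example
    // sentences, and source attribution that previously required joins
    // are embedded inline. Field names are illustrative only.
    val definition = MongoDBObject(
      "_id"          -> 18727353L,
      "word"         -> "mongo",
      "partOfSpeech" -> "noun",          // static lookup data, embedded directly
      "text"         -> "...",
      "examples"     -> MongoDBList(
        MongoDBObject("text" -> "...", "source" -> "cmu")))

    val coll = MongoConnection()("wordnik")("dictionary_entry")
    coll.save(definition)                          // one write, no foreign keys
    coll.findOne(MongoDBObject("word" -> "mongo")) // one read, the whole tree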
  • 12. Why #3: Object Modeling
    • Object models were being compromised for the sake of persistence
      • This is backwards!
      • Extra abstraction for the wrong reason
    • OK, then performance suffers
      • In-application joins across objects
      • "Who ran the fetch-all query against production?!" (any sysadmin)
    • "My zillionth ORM layer that only I understand" (and can maintain)
  • 13. Why #4: Scaling
    • Needed "cloud-friendly storage"
      • Easy up, easy down!
      • Startup: sync your data, then announce to clients when ready for business
      • Shutdown: announce your departure and leave
    • Adding MySQL instances was a dance: snapshot + bin files, then

        mysql> CHANGE MASTER TO MASTER_HOST='db1', MASTER_USER='xxx',
               MASTER_PASSWORD='xxx', MASTER_LOG_FILE='master-relay.000431',
               MASTER_LOG_POS=1035435402;
  • 14. Why #4: Scaling
    • What about those VMs?
      • So convenient! But... they kind of suck
      • Can the database succeed on a VM?
    • VM performance:
      • Memory, CPU, or I/O: pick only one
      • Can your database really reduce CPU or disk I/O with lots of RAM?
  • 15. Why #5: Big Picture
    • BI tools use relational constraints for discovery
      • Is this the right reason for them?
      • Can we work around this?
      • Let's have a BI tool revolution, too!
    • A true service architecture makes relational constraints impractical/impossible
    • Distributed sharding makes relational constraints impractical/impossible
  • 16. Why #5: Big Picture
    • Is your app smarter than your database?
      • The logic line is probably blurry!
    • What does count(*) really mean when you add 5k records/sec?
      • Maybe eventual consistency is not so bad...
    • 2PC? Do some reading and decide: http://eaipatterns.com/docs/IEEE_Software_Design_2PC.pdf
  • 17. OK, I'm in!
    • I thought deciding was easy!?
      • Many quickly maturing products
      • Divergent features tackle different needs
    • Wordnik spent 8 weeks researching and testing NoSQL solutions
      • This is a long time! (for a startup)
      • Wrote ODM classes and migrated our data
    • Surprise! There were surprises
      • Be prepared to compromise
  • 18. Choice Made, Now What?
    • We went with MongoDB ***
      • Fastest to implement
      • Most reliable
      • Best community
    • Why, against our 5 Whys?
      • Why #1: Fast loading/retrieval
      • Why #2: Fast ODM (50 tps => 1000 tps!)
      • Why #3: Document models === object models
      • Why #4: Memory-mapped files => kernel-managed memory + replica sets
      • Why #5: It's 2011, is there no progress?
  • 19. More on Why MongoDB
    • Testing, testing, testing
      • Used our migration tools to load-test
      • Read from MySQL, write to MongoDB (see the sketch below)
      • We loaded 5+ billion documents, many times over
    • In the end, one server could...
      • Insert 100k records/sec sustained
      • Read 250k records/sec sustained
      • Support concurrent loading/reading
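
A minimal sketch of that load-test loop, assuming JDBC on the MySQL side and the Casbah Scala driver on the Mongo side; the connection strings, table, and column names are all illustrative:

    import java.sql.DriverManager
    import com.mongodb.casbah.Imports._
    import scala.collection.mutable.ArrayBuffer

    val mysql = DriverManager.getConnection("jdbc:mysql://db1/wordnik", "xxx", "xxx")
    val mongo = MongoConnection()("wordnik")("dictionary_entry")

    val rs = mysql.createStatement().executeQuery(
      "SELECT id, word, definition FROM dictionary_entry")
    val batch = ArrayBuffer[DBObject]()
    while (rs.next()) {
      batch += MongoDBObject(
        "_id"  -> rs.getLong("id"),          // reuse the MySQL PK (slide 21)
        "word" -> rs.getString("word"),
        "def"  -> rs.getString("definition"))
      if (batch.size == 1000) { mongo.insert(batch: _*); batch.clear() } // batched writes
    }
    if (batch.nonEmpty) mongo.insert(batch: _*)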
  • 20. Migration & Testing
    • Iterated the ODM mapping multiple times; some issues:
      • Type safety:

          cur.next.get("iWasAnIntOnce").asInstanceOf[Long]

      • Dates as strings:

          obj.put("a_date", "2011-12-31") != obj.put("a_date", new Date("2011-12-31"))

      • Storage size (field names are stored in every document):

          obj.put("very_long_field_name", true) >> obj.put("vsfn", true)
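
The Int-vs-Long trap above bites because a value written as a 32-bit Int cannot be cast straight to Long. A hedged helper (not Wordnik's actual mapper code) that matches on the runtime type instead:

    import com.mongodb.DBObject

    def asLong(obj: DBObject, field: String): Option[Long] =
      obj.get(field) match {
        case i: java.lang.Integer => Some(i.longValue) // stored as an Int
        case l: java.lang.Long    => Some(l.longValue) // stored as a Long
        case _                    => None              // missing or another type
      }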
  • 21. Migration & Testing
    • Expect data model iterations
      • Wordnik migrated each table to a Mongo collection "as-is"
        • Easier to migrate and test
        • The _id field reused the MySQL PK
      • Auto-increment?
        • Used MySQL to "check out" sequences
        • One sequence row per Mongo collection
        • Run out of sequences => get more
        • Need exclusive locks here!
  • 22. Migration & Testing
    • Sequence generator in-process (see the sketch below):

        SequenceGenerator.checkout("doc_metadata,100")

    • Sequence generator as a web service
      • Centralized UID management
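
A sketch of how such a check-out might work: MySQL keeps one row per Mongo collection, and blocks of ids are reserved under an exclusive row lock. The table and column names, connection string, and internals are assumptions; the real SequenceGenerator was an in-house Wordnik class.

    import java.sql.DriverManager

    object SequenceGenerator {
      private val conn =
        DriverManager.getConnection("jdbc:mysql://db1/sequences", "xxx", "xxx")
      private var next, max = 0L

      // spec packs the collection name and block size, e.g. "doc_metadata,100"
      def checkout(spec: String): Unit = synchronized {
        val Array(name, block) = spec.split(",")
        conn.setAutoCommit(false)
        val rs = conn.createStatement().executeQuery(
          s"SELECT next_id FROM sequences WHERE name = '$name' FOR UPDATE") // exclusive lock
        rs.next()
        next = rs.getLong("next_id")
        max = next + block.toLong
        conn.createStatement().executeUpdate(
          s"UPDATE sequences SET next_id = $max WHERE name = '$name'")
        conn.commit() // releases the row lock
      }

      def nextId(spec: String): Long = synchronized {
        if (next >= max) checkout(spec) // ran out of sequences => get more
        val id = next; next += 1; id
      }
    }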
  • 23. Migration & Testing
    • Expect data access pattern iterations; so much more flexibility!
      • Reach into objects:

          > db.dictionary_entry.find({"hdr.sr":"cmu"})

      • Access to a whole object tree at query time
      • Overwrite a whole object at once... when desired
        • Not always! This clobbers the whole record:

            > db.foo.save({_id:18727353, foo:"bar"})

        • Update a single field instead:

            > db.foo.update({_id:18727353}, {$set:{foo:"bar"}})
  • 24. Flip the Switch
    • Migrate production with zero downtime
      • We temporarily halted loading data
      • Added a switch to flip between MySQL/MongoDB
      • Instrument, monitor, flip it, analyze, flip back
    • Profiling your code is key (a sketch follows below)
      • What is slow?
      • Build this into your app from day 1
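
One way to build that in from day 1, as a hedged sketch (the metrics sink is an assumption; Wordnik's instrumentation was in-house): wrap every storage call in a timer so the MySQL and MongoDB paths can be compared before, during, and after the flip.

    def timed[A](label: String)(body: => A): A = {
      val start = System.nanoTime
      try body
      finally {
        val millis = (System.nanoTime - start) / 1e6
        println(f"$label took $millis%.2f ms") // swap println for a real metrics system
      }
    }

    // usage: val sentence = timed("sentenceDao.find")(sentenceDao.find(id))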
  • 25. Flip the Switch
  • 26. Flip the Switch
    • Storage selected at runtime:

        val h = shouldUseMongoDb match {
          case true => new MongoDbSentenceDAO
          case _    => new MySQLDbSentenceDAO
        }
        h.find(...)

    • Hot-swappable storage via configuration
      • It worked!
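
What makes that flip safe is that both backends implement one DAO trait, so callers never know which store answered. A hedged sketch; the deck does not show the actual Wordnik interfaces:

    case class Sentence(id: Long, text: String)

    trait SentenceDAO {
      def find(id: Long): Option[Sentence]
      def save(sentence: Sentence): Unit
    }

    class MongoDbSentenceDAO extends SentenceDAO {
      def find(id: Long): Option[Sentence] = ??? // query the Mongo collection
      def save(sentence: Sentence): Unit = ???   // upsert into the Mongo collection
    }

    class MySQLDbSentenceDAO extends SentenceDAO {
      def find(id: Long): Option[Sentence] = ??? // SELECT by primary key
      def save(sentence: Sentence): Unit = ???   // INSERT ... ON DUPLICATE KEY UPDATE
    }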
  • 27. Then What?
    • Watched our deployment; many iterations on the mapping layer
      • Settled on an in-house, type-safe mapper: https://github.com/fehguy/mongodb-benchmark-tools
    • Some gotchas (of course)
      • Locking issues on long-running updates (more in a minute)
    • We want more of this!
      • Migrated shared files to Mongo GridFS
      • Easy IT
  • 28. Performance + Optimization
    • Loading data is fast!
      • Fixed collection padding and similarly sized records
      • The tail of the collection is always in memory
      • Appends were faster than MySQL in every case we tested
    • But... random access started getting slow
      • Indexes in RAM? Yes
      • Data in RAM? No: > 2TB per server
      • Limited by disk I/O and seek performance
      • EC2 + EBS for storage?
  • 29. Performance + Optimization
    • Moved to a physical data center
      • DAS & 72GB RAM => great uncached performance
    • Good move? Depends on the use case
      • If "access anything anytime", there aren't many options
      • Do you want to support this?
  • 30. Performance + Optimization
    • Inserts are fast; how about updates?
      • Well... an update => find the object, update it, save it
      • The lock is acquired at "find" and released after "save"
      • If hitting disk, lock time can be large
    • Easy answer: pre-fetch on update (see the sketch below)
      • Oh, and NEVER do "update all records" against a large collection
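
The pre-fetch trick, as a hedged sketch in Scala/Casbah (names illustrative): a throwaway read pages the document into memory first, so the write lock, which was global in 2011-era MongoDB, is held only for memory-speed work instead of a disk seek.

    import com.mongodb.casbah.Imports._

    val coll  = MongoConnection()("wordnik")("dictionary_entry")
    val query = MongoDBObject("_id" -> 18727353L)

    coll.findOne(query)                      // warm the page; no write lock held
    coll.update(query, $set("foo" -> "bar")) // lock now covers an in-memory update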
  • 31. Performance + Optimization
    • Indexes
      • Can't always keep the index in RAM; the memory-mapped files "do their thing"
      • A right-balanced B-tree keeps the necessary part of the index hot
      • Indexes hitting disk => mute your pager
    (diagram: a right-balanced B-tree, with only the rightmost keys kept hot)
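
A hedged illustration of the right-balanced point: if the indexed value only ever increases (a sequence-generated _id, or a createdAt timestamp as assumed here), inserts and recent-data reads all land on the rightmost edge of the B-tree, so only that edge needs to stay in RAM.

    import com.mongodb.casbah.Imports._

    val coll = MongoConnection()("wordnik")("dictionary_entry")
    coll.ensureIndex(MongoDBObject("createdAt" -> 1)) // monotonically increasing key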
  • 32. More Mongo, Please!
    • We modeled our Word Graph in Mongo
      • 50M nodes
      • 80M edges
      • 80 S edge fetch
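
The deck gives the graph's scale but not its schema; a hedged sketch of how an edge might look as a Mongo document, with all field names invented:

    import com.mongodb.casbah.Imports._

    val edges = MongoConnection()("wordnik")("word_graph_edges")
    edges.insert(MongoDBObject("from" -> "dog", "to" -> "canine", "rel" -> "synonym"))
    edges.ensureIndex(MongoDBObject("from" -> 1)) // fast per-node out-edge fetch
    edges.find(MongoDBObject("from" -> "dog"))    // all edges leaving "dog"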
  • 33. More Mongo, Please!
    • Analytics rolled up from aggregation jobs
      • Send to Hadoop, then load into Mongo for fast access
  • 34. What's Next
    • Liberate our models
      • Stop worrying about how to store them (for the most part)
    • New features are almost always NR
    • Some MySQL left
      • Less with each release
  • 35. Questions?
    • See more about Wordnik APIs: http://developer.wordnik.com
    • Migrating from MySQL to MongoDB: http://www.slideshare.net/fehguy/migrating-from-mysql-to-mongodb-at-wordnik
    • Maintaining your MongoDB installation: http://www.slideshare.net/fehguy/mongo-sv-tony-tam
    • Swagger API framework: http://swagger.wordnik.com
    • Mapping benchmark: https://github.com/fehguy/mongodb-benchmark-tools
    • Wordnik OSS tools: https://github.com/wordnik/wordnik-oss