
MongoDB as a fast and queryable cache

Martin Tepper of Travel IQ presents at Mongo Berlin

  1. MongoDB Conference Berlin 2011: MongoDB as a queryable cache
  2. About me • Martin Tepper • Lead Developer at Travel IQ • http://monogreen.de (slide footer: MongoDB as a queryable cache · Martin Tepper, monogreen.de · 2011-03-25)
  3. Contents • About Travel IQ • The problem • The solution • The headaches
  4. About Travel IQ • Meta Search Engine for Flights and Hotels • 9 Hotel Providers • 21 Flight Providers • ~ 6000 searches per day • ~ 64k provider queries per day
  5. About Travel IQ • Real-Time Aggregation • Ruby/Rails based • API-Driven
  6. Quick aside • Ruby: OO script language • Rails: MVC Web application framework • ActiveRecord: ORM framework
  7. The Problem
  8. Basic Architecture
  9. Basic Architecture
  10. Strongly Normalized • Very organized • Reuse of models • Saves disk space • But …
  11. sql = <<-SQL
      SELECT MIN(outerei.id) FROM
      (
        SELECT OBJ1.starts_at AS OBJ1_starts_at,
               OBJ1.ends_at AS OBJ1_ends_at,
               OBJ1.origin_id AS OBJ1_origin_id,
               OBJ1.destination_id AS OBJ1_destination_id,
               MIN(P1.price) AS the_price
        FROM packages P1
        LEFT JOIN journeys OBJ1 ON (P1.outbound_journey_id = OBJ1.id)
        LEFT JOIN results R1 ON (R1.package_id = P1.id)
        LEFT JOIN packagings PA1a ON (PA1a.package_id = P1.id AND PA1a.position = 1)
        LEFT JOIN offers O1a ON (PA1a.offer_id = O1a.id)
        WHERE R1.search_id IN (#{search_id})
          AND R1.search_type = 'FlightSearch'
          AND O1a.expires_at > #{expiring_after}
        GROUP BY OBJ1.starts_at, OBJ1.ends_at, OBJ1.origin_id, OBJ1.destination_id
      ) AS innerei
      JOIN
      (
        SELECT P2.id,
               OBJ2.starts_at AS OBJ2_starts_at,
               OBJ2.ends_at AS OBJ2_ends_at,
               OBJ2.origin_id AS OBJ2_origin_id,
               OBJ2.destination_id AS OBJ2_destination_id,
               P2.price
        FROM packages P2
        LEFT JOIN results R2 ON (R2.package_id = P2.id)
        LEFT JOIN journeys OBJ2 ON (P2.outbound_journey_id = OBJ2.id)
        LEFT JOIN packagings PA2a ON (PA2a.package_id = P2.id AND PA2a.position = 1)
        LEFT JOIN offers O2a ON (PA2a.offer_id = O2a.id)
        WHERE R2.search_id IN (#{search_id})
  12. The problem • Strongly normalized database • Complex query requirements • Lots of joins • ActiveRecord and rendering overhead • Slow API calls
  13. The Solution
  14. Solution 1: Schema • Redo the schema • Migration hard • Some relationships hard to denormalize
  15. Solution 2: Memcached • Memcached • Very fast response times • But no real queries → Horrible abstraction layer
  16. Memcached response times over time [chart: response time of API call in seconds (0 to 10) vs. seconds after search start (1 to 45)]
  17. Solution 3: MongoDB • Document-oriented – less render overhead • Grouping of offers • Proper queries and counts • Still quite fast
  18. How we use MongoDB
  19. How we use MongoDB • Replica set with 2 nodes and 2 arbiters • Two servers with 16 cores / 64 GB RAM → run MySQL and MongoDB • ~ 600 writes/s and reads/s normal load • ~ 6000 writes/s doable
  20. MongoDB response times over time [chart: response time of API call in seconds (0 to 10) vs. seconds after search start (1 to 45)]
  21. The Headaches
  22. Problems with MongoDB • Segmentation Faults • Only in production
  23. Problems with MongoDB • Segmentation Faults • Only in production → Replica Set helped a lot → Fixed with nightly build
  24. Problems with MongoDB • Write performance during peak load • Lots of small concurrent writes
  25. Problems with MongoDB • Write performance during peak load • Lots of small concurrent writes → Solved by bundling writes
  26. Problems with MongoDB • Hotel data too big to denormalize • In separate collection
  27. Problems with MongoDB • Hotel data too big to denormalize • In separate collection → Solved with app-level “join”
  28. Problems with MongoDB • Data consistency • Typical caching problem • Updates to MySQL also in MongoDB
  29. Problems with MongoDB • Data consistency • Typical caching problem • Updates to MySQL also in MongoDB → Solved with callbacks in ActiveRecord
  30. Thank you
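
Slide 25 names "bundling writes" as the fix for many small concurrent writes, but the deck does not show how it was done. The following is a minimal sketch of one way to do it, not Travel IQ's implementation: the `BundledWriter` class, its `batch_size` parameter, and the sink block are all hypothetical. The sink stands in for a single bulk insert into MongoDB (for example the Ruby driver's `insert_many`), so the example runs without a database.

```ruby
# Hypothetical write buffer: collects documents and hands them to a sink
# in batches, so many small writes become a few large ones.
class BundledWriter
  # batch_size and the sink block are illustrative assumptions.
  def initialize(batch_size: 100, &sink)
    raise ArgumentError, "a sink block is required" unless sink

    @batch_size = batch_size
    @sink = sink
    @buffer = []
    @mutex = Mutex.new
  end

  # Queue one document; deliver the batch once it is full.
  def write(doc)
    @mutex.synchronize do
      @buffer << doc
      deliver if @buffer.size >= @batch_size
    end
  end

  # Deliver whatever is left, e.g. once a provider response is processed.
  def flush
    @mutex.synchronize { deliver }
  end

  private

  # Must be called while holding the mutex.
  def deliver
    return if @buffer.empty?

    batch = @buffer
    @buffer = []
    @sink.call(batch)
  end
end

# Usage: the sink here only records batches; a real sink would issue
# one bulk insert per batch.
batches = []
writer = BundledWriter.new(batch_size: 3) { |docs| batches << docs }
7.times { |i| writer.write({ "offer_id" => i }) }
writer.flush
```

Calling the sink while holding the lock keeps the sketch simple and serializes bulk writes; a production version would probably also flush on a timer so that a half-full batch does not wait indefinitely.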
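
Slide 27 says hotel data lives in a separate collection and is combined with an app-level "join". A minimal sketch of that pattern, under assumptions: the field names (`hotel_id`, `name`, `stars`) and helper names are hypothetical, and plain arrays stand in for the two MongoDB collections. The idea is to collect the referenced ids, fetch all matching hotels in one query (with MongoDB, a filter using the `$in` operator), and merge the results in Ruby.

```ruby
# Stand-ins for two MongoDB collections (hypothetical field names).
OFFERS = [
  { "_id" => 1, "hotel_id" => 10, "price" => 120 },
  { "_id" => 2, "hotel_id" => 11, "price" => 95 },
  { "_id" => 3, "hotel_id" => 10, "price" => 110 }
].freeze

HOTELS = [
  { "_id" => 10, "name" => "Hotel A", "stars" => 4 },
  { "_id" => 11, "name" => "Hotel B", "stars" => 3 }
].freeze

# Stand-in for one query with a filter like { "_id" => { "$in" => ids } }.
def find_hotels_by_ids(ids)
  HOTELS.select { |hotel| ids.include?(hotel["_id"]) }
end

# App-level join: one lookup for all hotels, then an in-memory merge.
def join_offers_with_hotels(offers)
  hotel_ids = offers.map { |offer| offer["hotel_id"] }.uniq
  hotels_by_id = find_hotels_by_ids(hotel_ids).map { |h| [h["_id"], h] }.to_h
  # merge returns new hashes, so the cached offer documents stay untouched.
  offers.map { |offer| offer.merge("hotel" => hotels_by_id[offer["hotel_id"]]) }
end

joined = join_offers_with_hotels(OFFERS)
```

Fetching the hotels in a single query rather than one per offer keeps the number of round trips constant regardless of how many offers a search returns.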
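
Slide 29 says updates to MySQL are mirrored into MongoDB with ActiveRecord callbacks. The sketch below illustrates the idea in plain Ruby so that it runs without Rails or a database. Everything in it is a hypothetical stand-in: `CacheStore` plays the MongoDB collection, and the `save` and `destroy` methods on `Offer` imitate ActiveRecord persistence. In a Rails model the two private methods would instead be registered with `after_save` and `after_destroy`.

```ruby
# In-memory stand-in for the MongoDB cache collection, keyed by record id.
class CacheStore
  def initialize
    @docs = {}
  end

  def upsert(id, doc)
    @docs[id] = doc
  end

  def remove(id)
    @docs.delete(id)
  end

  def find(id)
    @docs[id]
  end
end

# Hypothetical model. In Rails, sync_to_cache and remove_from_cache would be
# declared as after_save / after_destroy callbacks rather than called by hand.
class Offer
  attr_reader :id
  attr_accessor :price

  def initialize(id:, price:, cache:)
    @id = id
    @price = price
    @cache = cache
  end

  def save
    # The write to MySQL would happen here.
    sync_to_cache
    true
  end

  def destroy
    # The delete from MySQL would happen here.
    remove_from_cache
    true
  end

  private

  def sync_to_cache
    @cache.upsert(id, { "price" => price })
  end

  def remove_from_cache
    @cache.remove(id)
  end
end

# Usage: every save or destroy is mirrored into the cache.
cache = CacheStore.new
offer = Offer.new(id: 1, price: 100, cache: cache)
offer.save
offer.price = 80
offer.save
cached_after_update = cache.find(1)
offer.destroy
cached_after_destroy = cache.find(1)
```

Tying the cache update to the model's lifecycle means every code path that changes a record also refreshes the cached copy, which is what addresses the "typical caching problem" the slide mentions. Writes that bypass the model, such as raw SQL updates, would still leave the cache stale.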
