Why we chose mongodb for guardian.co.uk

Published in: Technology

  1. Why we chose mongodb for guardian.co.uk. Graham Tackley, Web Platform Team Lead, guardian.co.uk
  2. “It is not the strongest of the species that survives, nor the most intelligent. It is the one that is most adaptable to change.”
  3. Early Period, circa ’95: The “Lash It Together” era
  4. Early Period (’95, the “Lash It Together” era): Perl, CGI, Apache. Experimental. Manual processes. Bespoke software. RDBMS, scripts & static files
  5. Mid Period, circa ’00: The “Vendor CMS” era
  6. Mid Period: 2000s (The “Vendor CMS era”): Vignette / AOLserver. TCL, Apache, Oracle. Platform for online publishing. Initially scales well, with acceleration in delivery of features
  7. Mid Period: 2000s (The “Vendor CMS era”): Surprise! Vendor’s CMS doesn’t do what we want! Mish-mash in templates: HTML, JavaScript, TCL, SQL, PL-SQL. No model in app tier, only in RDBMS schema created in Oracle Designer
  8. Mid Period: 2000s (The “Vendor CMS era”)
  9. Mid Period: 2000s (The “Vendor CMS era”)
  10. Mid Period: 2000s (The “Vendor CMS era”): After a few years, very difficult to extend. Database schema becomes fixed due to dependencies in templates
  11. Mid Period: 2000s (The “Vendor CMS era”): If you can’t change the system:
  12. Modern Period, circa ’05-09: The “J2EE Monolithic” era
  13. Diagram: three web servers, three app servers, Oracle, CMS, data feeds. “I bring you NEWS!!!”
  14. Diagram: three web servers, three app servers, Oracle, CMS, data feeds. “I bring you NEWS!!!” Modern Java app. Spring / Hibernate. DDD / TDD. Strong in Java model. Database abstracted away with ORM
  15. Problems
  16. Each release involves schema upgrade. Schema upgrade = downtime for journalists
  17. Complexity still increasing: 300+ tables, 10,000 lines of Hibernate XML config, 1,000 domain objects mapped to database, 70,000 lines of domain object code. Very tight binding to database
  18. ORM not really masking complexity: Database has strong influence on domain model: many domain objects made more complex mapping joins in RDBMS. Complex Hibernate features used: interceptors, proxies. Complex caching strategy. Lots of optimisations. And: we still hand-code complex queries in SQL!
  19. Load becoming an issue. RDBMS difficult to scale
  20. Partial NoSQL, circa ’09-10: The “Sticking Plaster” era
  21. Introduce yet more caching to patch up load problems. Introduction of memcached
  22. Decouple applications from database by building APIs. Power APIs using alternative, more scalable technologies. APIs used to scale out database reads. Writes still go to RDBMS
  23. Content API. Mutualised news! http://content.guardianapis.com Read API delivered using Apache Solr. Hosted in EC2. Document-oriented search engine. Scales well for read operations
  24. Diagram: Core and API. Components: web servers, app server, Memcached (20Gb), RDBMS, CMS, Solr, and several Solr/API nodes. Cloud, EC2
  25. Mutualised news! We’ve solved our load problem (for now) but increased our complexity
  26. Mutualised news! We now have 3 models! RDBMS tables, Java objects, JSON API
  27. Mutualised news!
  28. Mutualised news!
  29. Mutualised news!
  30. Mutualised news! API is very simple JSON. Multiple domain concepts expressed in single document. Can be designed in forwardly extensible way. What if the JSON API was our primary model?
  31. Full NoSQL, in development: The “It’s the future!” era
  32. Database selection: Simple keystore. Too simple? Huge scalability. Do we need it? Schema design difficult. Simple to use, can execute similar queries to RDBMS
  33. MongoDB database: Document oriented. Stores parsed JSON documents. Can express complex queries. Can be flexible about consistency. Malleable schema: can easily change at runtime. Can work at both large & small scales
  34. Flexible Schema
  35. Flexible Schema
  36. Flexible Schema. Can easily represent different classes of tag as documents. Both documents can be inserted into same collection. Far simpler than equivalent Hibernate mapped subclass configuration
  37. Flexible Schema. Simple to query:
  38. Flexible Schema. Simple to query: Query operators: $ne, $nin, $all, $exists, $gt, $lt, $gte ...
  39. Modifying the schema
  40. Modifying the schema
  41. Modifying the schema
  42. The first project: Identity. Current login/registration system still in TCL/PL-SQL. 3M+ users in relational database. Very complex schema + PL-SQL. New system required. Can we migrate from Oracle to NoSQL?
  43. Build API that can support both backends. Diagram: Registration app, guardian.co.uk, API, Oracle. This bit is hard!
  44. Build API that can support both backends. Diagram: Registration app, guardian.co.uk, API, MongoDB, Oracle
  45. Migrate using API & decommission. Diagram: Registration app, guardian.co.uk, API, MongoDB
  46. Add new stuff! Diagram: Registration app, guardian.co.uk, API, MongoDB, Solr? Redis?
  47. MongoDB: Simple, flexible schema with similar query & indexing to RDBMS. Great at small or large scale. Easy for developers to get going. Commercial support available (10gen). One day may power all of guardian.co.uk. No transactions / joins: developers must cater for this. Produces a net reduction in lines of code / complexity
  48. Shameless plugs: http://content.guardianapis.com We’re hiring: http://www.gnmcareers.co.uk ref JS323. graham.tackley@guardian.co.uk - @tackers
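
Slides 36 to 38 say that differently shaped tag documents can share one collection and be queried with operators such as $ne, $nin, $exists, $gt, $lt and $gte. The slides' own examples did not survive in this transcript, so what follows is only a rough sketch of the idea in plain Python: a toy in-memory matcher, not MongoDB or its driver. The tag documents, their field names and the `matches` / `find` helpers are all hypothetical, and the operator handling is deliberately simplified (scalar fields only, ordering comparisons on numbers only).

```python
from typing import Any, Dict, List

# Hypothetical tag documents with different shapes, held in one "collection".
# Field names are illustrative; they do not come from the slides.
tags: List[Dict[str, Any]] = [
    {"_id": 1, "type": "keyword", "name": "Politics", "section": "news"},
    {"_id": 2, "type": "contributor", "name": "A. Writer", "articles": 42},
    {"_id": 3, "type": "keyword", "name": "Cricket", "section": "sport"},
]

_MISSING = object()  # sentinel meaning "field not present in this document"


def _is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _check(op: str, arg: Any, value: Any) -> bool:
    """Evaluate one operator against one field value (simplified semantics)."""
    present = value is not _MISSING
    if op == "$exists":
        return present == bool(arg)
    if op == "$ne":
        return not present or value != arg  # a missing field counts as "not equal"
    if op == "$nin":
        return not present or value not in arg
    if op in ("$gt", "$gte", "$lt"):
        if not (_is_number(value) and _is_number(arg)):
            return False  # toy rule: only numbers are ordered in this sketch
        if op == "$gt":
            return value > arg
        if op == "$gte":
            return value >= arg
        return value < arg
    raise ValueError(f"unsupported operator: {op}")


def matches(doc: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """True if the document satisfies every condition in the query."""
    for field, cond in query.items():
        value = doc.get(field, _MISSING)
        if isinstance(cond, dict):  # operator form, e.g. {"$gte": 40}
            if not all(_check(op, arg, value) for op, arg in cond.items()):
                return False
        elif value is _MISSING or value != cond:  # plain equality form
            return False
    return True


def find(collection: List[Dict[str, Any]], query: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the documents in the collection that match the query."""
    return [doc for doc in collection if matches(doc, query)]
```

With these helpers, `find(tags, {"type": "keyword"})` returns the two keyword tags, and `find(tags, {"articles": {"$exists": True}})` returns only the contributor tag, even though the two kinds of document have no shared table definition.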

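
Slides 43 to 45 describe putting an API in front of both Oracle and MongoDB, migrating through it, and then decommissioning Oracle. The slides do not say how the migration itself was done, so this is just one possible approach, sketched under assumptions: copy-on-read combined with dual writes. `UserStore`, `InMemoryStore` and `MigratingStore` are hypothetical names, and plain dictionaries stand in for both databases.

```python
from typing import Dict, Optional, Protocol


class UserStore(Protocol):
    """Minimal interface the API needs from any backend (hypothetical)."""

    def get(self, user_id: str) -> Optional[dict]: ...

    def put(self, user_id: str, user: dict) -> None: ...


class InMemoryStore:
    """Stand-in for a real backend; a dict plays the role of the database."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def get(self, user_id: str) -> Optional[dict]:
        row = self._rows.get(user_id)
        return dict(row) if row is not None else None

    def put(self, user_id: str, user: dict) -> None:
        self._rows[user_id] = dict(user)


class MigratingStore:
    """Facade used by the API while both backends are live.

    Assumed strategy, not taken from the slides: read from the new store,
    fall back to the old one and copy the record across on first read;
    write to both so either backend can still serve traffic.
    """

    def __init__(self, old: UserStore, new: UserStore) -> None:
        self.old = old
        self.new = new

    def get(self, user_id: str) -> Optional[dict]:
        user = self.new.get(user_id)
        if user is None:
            user = self.old.get(user_id)
            if user is not None:
                self.new.put(user_id, user)  # lazy migration on first read
        return user

    def put(self, user_id: str, user: dict) -> None:
        self.new.put(user_id, user)
        self.old.put(user_id, user)


# Usage: the registration app only ever talks to the facade.
oracle_like = InMemoryStore()
mongo_like = InMemoryStore()
oracle_like.put("u1", {"email": "reader@example.com"})

api_store = MigratingStore(old=oracle_like, new=mongo_like)
first_read = api_store.get("u1")            # served from the old store, then copied
api_store.put("u2", {"email": "new@example.com"})
```

Once every record has been copied and verified, the facade can be swapped for the new store alone, which corresponds to the decommissioning step on slide 45.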