Why I chose MongoDB for guardian.co.uk
Mat Wall
Lead Software Architect, guardian.co.uk
“It is not the strongest of the species that
survives, nor the most intelligent. It is the one
       that is most adaptable to change.”
Early Period

         circa ’95

The “Lash It Together” era
Early Period (95, the “Lash It Together” era)


 Perl, CGI, Apache

  Experimental
Manual processes
Bespoke software

 RDBMS, scripts
  & static files
Mid Period

      circa ’00

The “Vendor CMS” era
Mid Period: 2000s (The “Vendor CMS era”)


 Vignette / AOLserver
 TCL, Apache, Oracle

  Platform for online
       publishing

Initially scales well with
acceleration in delivery
        of features
Mid Period: 2000s (The “Vendor CMS era”)


 Surprise! Vendor’s CMS
doesn’t do what we want!

 Mish-mash in templates:
 HTML, JavaScript, TCL,
      SQL, PL-SQL

No model in app tier, only
in RDBMS schema created
    in Oracle Designer
Mid Period: 2000s (The “Vendor CMS era”)



After a few years, very
   difficult to extend

  Database schema
becomes fixed due to
  dependencies in
     templates
Mid Period: 2000s (The “Vendor CMS era”)




If you can’t change the
        system:
Modern Period

       circa ’05-09

The “J2EE Monolithic” era
Web server       Web server         Web server



App server      App server          App server




                 Oracle


         CMS                  Data feeds
 Modern Java app
 Spring / Hibernate
 DDD / TDD
 Strong model in Java
 Database abstracted away with ORM
Problems
Each release involves schema upgrade

Schema upgrade = downtime for journalists
Complexity still increasing:

               300+ tables,
  10,000 lines of hibernate XML config
1,000 domain objects mapped to database
   70,000 lines of domain object code
      Very tight binding to database
ORM not really masking complexity:

 Database has strong influence on domain model: many
domain objects made more complex by mapping joins in
                      RDBMS

Complex hibernate features used, interceptors, proxies

               Complex caching strategy
                Lots of optimisations

                    And:
We still hand code complex queries in SQL!
Load becoming an issue

RDBMS difficult to scale
Partial NoSQL

       circa ’09-10

The “Sticking Plaster” era
Introduce yet more caching to patch up load problems

 Decouple applications from database by building APIs

Power APIs using alternative, more scalable technologies

         APIs used to scale out database reads

               Writes still go to RDBMS
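
A minimal sketch of the caching idea on this slide, assuming plain Python dicts stand in for both the RDBMS and memcached. The class and method names are hypothetical and are not taken from the Guardian codebase. Reads are served from the cache when possible; writes still go to the primary store.

```python
class ReadThroughCache:
    """Cache-aside reads in front of a primary store (both are plain dicts here)."""

    def __init__(self, primary):
        self.primary = primary   # stands in for the RDBMS
        self.cache = {}          # stands in for memcached
        self.primary_reads = 0   # counts how often the primary store is hit

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        self.primary_reads += 1
        value = self.primary.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes still go to the primary store; the cached copy is dropped
        self.primary[key] = value
        self.cache.pop(key, None)
```

Repeated reads of the same key hit the primary store once; a write invalidates the cached copy so the next read fetches the new value.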
Core
                             Api
   Web servers

                            Solr/API
    App server
                            Solr/API
Memcached (20Gb)
                            Solr/API

     rdbms         Solr
                            Solr/API

      M/Q                   Solr/API

     CMS                  Cloud, EC2
Content API

 Read API delivered using Apache Solr

            Hosted in EC2

   Document oriented search engine

  Loose schema: records, fields, facets

    Scales well for read operations
Related content from Solr




           Introduction of memcached
We’ve solved our load problem (for now)

                 but

       Increased our complexity
      We now have 3 models!

           RDBMS tables

            Java Objects

             JSON API
JSON API is very simple

Multiple domain concepts expressed in single document

     Can be designed in forwardly extensible way

What if the JSON API was our primary model?
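
A small illustration of a single JSON document carrying several domain concepts, and of a reader written to be forwardly extensible. The field names below are hypothetical and do not reflect the real Content API format. The reader picks out the keys it knows and ignores the rest, so new keys can be added without breaking it.

```python
import json

# Hypothetical article document: headline, tags and contributors in one place
article_json = """
{
  "id": "world/2011/mar/10/example",
  "headline": "Example headline",
  "tags": [{"id": "world/world", "type": "keyword"}],
  "contributors": [{"name": "A. Reporter"}],
  "newFieldAddedLater": true
}
"""

def summarise(raw):
    """Read only the known keys; unknown or missing keys are tolerated."""
    doc = json.loads(raw)
    return {
        "headline": doc["headline"],
        "tag_ids": [t["id"] for t in doc.get("tags", [])],
        "contributors": [c["name"] for c in doc.get("contributors", [])],
    }
```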
Full NoSQL

    in development

The “It’s the future!” era
The first project: Identity

Current login/registration system still in TCL/PL-SQL

          3M+ users in relational database

          Very complex schema + PL-SQL

               New system required

      Can we migrate from Oracle to NoSQL?
Database selection


      Simple keystore. Too simple?



     Huge scalability. Do we need it?
        Schema design difficult.


    Simple to use, can execute similar
           queries to RDBMS
MongoDB

     Document oriented database
       Stores parsed JSON documents

        Can express complex queries

      Can be flexible about consistency

Malleable schema: can easily change at runtime

    Can work at both large & small scales
MongoDB concepts


    RDBMS           MongoDB
     Table          Collection

      Row        JSON Document

     Index            Index

      Join      Embedding & Linking

    Partition         Shard
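
To make the "Join" row concrete, here is a sketch of embedding versus linking using plain Python dicts. The documents and helper names are made up for illustration. With embedding the related data travels inside the parent document; with linking the application performs the second lookup itself.

```python
# Embedding: the related data lives inside the parent document
article_embedded = {
    "_id": "a1",
    "headline": "Example",
    "tags": [{"name": "politics"}, {"name": "uk"}],
}

# Linking: the parent stores ids that point into another collection
tag_collection = {
    "t1": {"_id": "t1", "name": "politics"},
    "t2": {"_id": "t2", "name": "uk"},
}
article_linked = {"_id": "a1", "headline": "Example", "tag_ids": ["t1", "t2"]}

def tag_names_embedded(article):
    return [t["name"] for t in article["tags"]]

def tag_names_linked(article, tags):
    # The "join" happens in application code, one lookup per linked id
    return [tags[tag_id]["name"] for tag_id in article["tag_ids"]]
```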
Flexible Schema


Can easily represent different classes of tag as
                 documents

    Both documents can be inserted into
             same collection

    Far simpler than equivalent hibernate
       mapped subclass configuration
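
The original slides showed this with screenshots that did not survive extraction. As a stand-in, a sketch with invented field names: two classes of tag stored side by side in one list that plays the role of a collection, with the differences handled in ordinary code rather than in mapping configuration.

```python
# Two different "classes" of tag in the same collection (a list of dicts here)
tags = [
    {"type": "keyword", "name": "Politics", "section": "politics"},
    {"type": "contributor", "name": "A. Reporter", "byline_image": "reporter.jpg"},
]

def describe(tag):
    """Handle each class of tag without any subclass mapping."""
    if tag["type"] == "contributor":
        image = tag.get("byline_image", "none")
        return f'{tag["name"]} (contributor, image: {image})'
    return f'{tag["name"]} ({tag["type"]})'
```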
Flexible Schema

        Simple to query:
            Query operators:
  $ne, $nin, $all, $exists, $gt, $lt, $gte ...
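
To give a feel for how these operators read, here is a toy matcher over Python dicts. It is not MongoDB's implementation and it simplifies the real semantics (for example, plain equality on array fields is not handled the way the server handles it). The function name and the sample documents are assumptions for illustration.

```python
def matches(doc, query):
    """Return True if doc satisfies every condition in query (simplified)."""
    ops = {
        "$ne": lambda v, arg: v != arg,
        "$gt": lambda v, arg: v is not None and v > arg,
        "$gte": lambda v, arg: v is not None and v >= arg,
        "$lt": lambda v, arg: v is not None and v < arg,
        "$nin": lambda v, arg: v not in arg,
        "$all": lambda v, arg: isinstance(v, list) and all(a in v for a in arg),
    }
    for field, cond in query.items():
        present = field in doc
        value = doc.get(field)
        if isinstance(cond, dict):
            # Operator form, e.g. {"age": {"$gte": 18}}
            for op, arg in cond.items():
                if op == "$exists":
                    if present != arg:
                        return False
                elif not ops[op](value, arg):
                    return False
        elif value != cond:
            # Plain form, e.g. {"name": "a"}
            return False
    return True

users = [
    {"name": "a", "age": 30, "lists": ["news", "sport"]},
    {"name": "b", "age": 17, "lists": ["news"]},
    {"name": "c"},
]
adults = [u["name"] for u in users if matches(u, {"age": {"$gte": 18}})]
```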
Modifying the schema


Schema upgrades


   Schema can be upgraded simply by upgrading the
                 application version

Application must deal with differing document versions

           Can become complex over time
Schema upgrades


           This can be mitigated by:

        Adding a “version” key to each document

Updating the version each time the application modifies a
                       document

Using MapReduce capability to forcibly migrate documents
            from older versions if required
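
A sketch of the "version" key approach, assuming the application upgrades documents as it reads them. The migration steps and field names are invented for illustration; the talk does not show the real ones. Each step lifts a document by one version until it reaches the current one.

```python
CURRENT_VERSION = 3

def v1_to_v2(doc):
    # Hypothetical change: split "name" into first and last name
    first, _, last = doc.pop("name").partition(" ")
    doc["firstName"], doc["lastName"] = first, last
    return doc

def v2_to_v3(doc):
    # Hypothetical change: introduce a new flag with a default
    doc.setdefault("emailVerified", False)
    return doc

MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}

def upgrade(doc):
    """Return a copy of doc migrated to CURRENT_VERSION."""
    doc = dict(doc)                    # leave the caller's document untouched
    version = doc.get("version", 1)    # documents without a key count as v1
    while version < CURRENT_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
    doc["version"] = version
    return doc
```

The same `upgrade` function could be run over a whole collection in a batch job when old versions need to be retired.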
MongoDB architecture




                     mongod


                  Single node
Durability only possible in upcoming 1.8 release
    (database fsync from buffer every minute)
MongoDB architecture
         master                      replicas
          mongod                      mongod

                                      mongod

    Replica set                       mongod

                                      mongod



 Can choose to read & write from master for full consistency
 Can choose to run reads on slaves to scale reads
MongoDB architecture
         master                      replicas
          mongod                      mongod
                                      mongod
    Replica set                       mongod
                                      mongod

 Can choose to read & write from master for full consistency
 Can choose to accept dirty reads from slaves to scale reads

 Durability achieved (<1.8) via replication
 Reads can be scaled out onto replicas (eventual consistency)
 All writes to master
 If master fails, new master nominated by election
 DB drivers handle most cluster complexity
MongoDB architecture


Aggregator                         mongos


consistent     shard      shard             shard     shard
 (master)

               replica   replica            replica   replica
inconsistent
  (replica)    replica   replica            replica   replica

               replica   replica            replica   replica
MongoDB architecture

 Writes scaled by sharding
 Shards populated by ranges
 mongos queries appropriate shard(s)
 Shards automatically balanced
 Developers (essentially) unaware of shards
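
A rough sketch of range-based routing, the idea behind "shards populated by ranges" and "mongos queries appropriate shard(s)". This is an illustration in plain Python, not how mongos is implemented; the class name, split points and shard names are all assumptions.

```python
from bisect import bisect_right

class RangeRouter:
    """Route keys to shards by contiguous key ranges."""

    def __init__(self, split_points, shard_names):
        # N split points define N + 1 contiguous ranges, one per shard
        if len(shard_names) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = sorted(split_points)
        self.shard_names = shard_names

    def shard_for(self, key):
        # A key equal to a split point belongs to the range that starts there
        return self.shard_names[bisect_right(self.split_points, key)]

    def shards_for_range(self, low, high):
        # A range query only needs the shards whose ranges overlap [low, high]
        first = bisect_right(self.split_points, low)
        last = bisect_right(self.split_points, high)
        return self.shard_names[first:last + 1]

router = RangeRouter(["h", "p"], ["shard0", "shard1", "shard2"])
```

A lookup by key touches one shard, while a range query touches only the shards whose ranges overlap it.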
MongoDB durability


Relies (pre 1.8) on replication for durability
1.8 features optional journaling & redo logs

 Database users need to be cluster aware,
         each query can specify:

  No error checking / write confirmation
       Write confirmed on master
   Write replicated to N slave servers
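
The three confirmation levels above can be pictured with a toy replica set. This is a simulation in plain Python, not a driver API: the class, the `confirm` flag and the `wait_for_replicas` parameter are made-up names, and replication is shown as a synchronous copy purely for illustration.

```python
class ReplicaSetSim:
    """Toy master plus replicas, each modelled as a dict."""

    def __init__(self, replica_count):
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value, confirm=True, wait_for_replicas=0):
        if wait_for_replicas > len(self.replicas):
            raise ValueError("not enough replicas")
        self.master[key] = value
        if not confirm:
            return None  # no error checking: the caller learns nothing
        for replica in self.replicas[:wait_for_replicas]:
            replica[key] = value
        return 1 + wait_for_replicas  # nodes known to hold the write

    def catch_up(self):
        # Stands in for asynchronous replication eventually completing
        for replica in self.replicas:
            replica.update(self.master)

    def read(self, key, replica_index=None):
        # Reading from a replica may return stale data until it catches up
        node = self.master if replica_index is None else self.replicas[replica_index]
        return node.get(key)
```

Waiting for more replicas makes a write safer but slower; reads from a replica that was not waited for can be stale until replication catches up.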
Old Identity system
 Hundreds of tables & stored procedures

       New Identity model

     User                     List
       Fields      Text
       Dates     Date/Time
      Statuses    Boolean
Very simple domain objects




   Simple, flexible objects
    No hibernate session
Very simple domain objects




Flexible schema embraced in domain object design
Very simple domain objects




Using Casbah Scala drivers = significant reduction in LOC
                 vs SQL implementation
Build API that can support both backends


    Registration app         guardian.co.uk




                       API             This bit is hard!


         MongoDB                       Oracle
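
One possible shape for an API that sits in front of both backends, sketched in Python. The talk does not say how the real API was built, so the dual-write strategy, the class names and the `read_from_new` switch are all assumptions. `DictStore` stands in for either database.

```python
class DictStore:
    """Stand-in for a real backend (MongoDB or Oracle in the diagram)."""

    def __init__(self):
        self.rows = {}

    def save(self, user_id, user):
        self.rows[user_id] = dict(user)

    def find(self, user_id):
        return self.rows.get(user_id)

class IdentityApi:
    """Clients talk to this API and never to a database directly."""

    def __init__(self, old_store, new_store, read_from_new=False):
        self.old_store = old_store
        self.new_store = new_store
        self.read_from_new = read_from_new

    def save_user(self, user_id, user):
        # Write to both backends so either can serve reads during migration
        self.old_store.save(user_id, user)
        self.new_store.save(user_id, user)

    def find_user(self, user_id):
        if self.read_from_new:
            primary, fallback = self.new_store, self.old_store
        else:
            primary, fallback = self.old_store, self.new_store
        user = primary.find(user_id)
        # Fall back for users that have not been migrated yet
        return user if user is not None else fallback.find(user_id)
```

Once every user exists in the new store, the old backend and the fallback path can be removed without the client applications changing.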
Migrate using API & decommission


 Registration app         guardian.co.uk




                    API



      MongoDB
Add new stuff!


    Registration app           guardian.co.uk




                         API



MongoDB                Solr?                    Redis?
MongoDB

Simple, flexible schema with similar query & indexing to
                        RDBMS
              Great at small or large scale
            Easy for developers to get going
         Commercial support available (10Gen)
        One day may power all of guardian.co.uk

No transactions / joins: developers must cater for this

Produces a net reduction in lines of code / complexity
Shameless plug




        We’re hiring:

http://www.careersatgnl.co.uk

QCon London 2011: Matthew Wall, Why I chose MongoDB for guardian.co.uk

  • 1. Why I chose MongoDB for guardian.co.uk. Mat Wall, Lead Software Architect, guardian.co.uk
  • 2. “It is not the strongest of the species that survives, nor the most intelligent. It is the one that is most adaptable to change.”
  • 3. Early Period, circa ’95: The “Lash It Together” era
  • 4. Early Period (’95, the “Lash It Together” era): Perl, CGI, Apache. Experimental. Manual processes. Bespoke software. RDBMS, scripts & static files.
  • 5. Mid Period, circa ’00: The “Vendor CMS” era
  • 6. Mid Period: 2000s (The “Vendor CMS era”): Vignette / AOLserver. TCL, Apache, Oracle. Platform for online publishing. Initially scales well with acceleration in delivery of features.
  • 7. Mid Period: 2000s (The “Vendor CMS era”): Surprise! Vendor’s CMS doesn’t do what we want! Mish-mash in templates: HTML, JavaScript, TCL, SQL, PL-SQL. No model in app tier, only in RDBMS schema created in Oracle Designer.
  • 8. Mid Period: 2000s (The “Vendor CMS era”)
  • 9. Mid Period: 2000s (The “Vendor CMS era”)
  • 10. Mid Period: 2000s (The “Vendor CMS era”): After a few years, very difficult to extend. Database schema becomes fixed due to dependencies in templates.
  • 11. Mid Period: 2000s (The “Vendor CMS era”) If you can’t change the system:
  • 12. Modern Period, circa ’05-09: The “J2EE Monolithic” era
  • 13.
  • 14. Architecture diagram: Web servers (×3); App servers (×3); Oracle; CMS; Data feeds. “I bring you NEWS!!!”
  • 15. Same architecture diagram, annotated: Modern Java app. Spring / Hibernate. DDD / TDD. Strong model in Java. Database abstracted away with ORM.
  • 17. Each release involves schema upgrade. Schema upgrade = downtime for journalists.
  • 18. Complexity still increasing: 300+ tables; 10,000 lines of Hibernate XML config; 1,000 domain objects mapped to database; 70,000 lines of domain object code. Very tight binding to database.
  • 19. ORM not really masking complexity. Database has strong influence on domain model: many domain objects made more complex by mapping joins in the RDBMS. Complex Hibernate features used: interceptors, proxies. Complex caching strategy. Lots of optimisations. And: we still hand code complex queries in SQL!
  • 20. Load becoming an issue. RDBMS difficult to scale.
  • 21. Partial NoSQL, circa ’09-10: The “Sticking Plaster” era
  • 22. Introduce yet more caching to patch up load problems. Decouple applications from database by building APIs. Power APIs using alternative, more scalable technologies. APIs used to scale out database reads. Writes still go to RDBMS.
  • 23. Architecture diagram: Core API; Web servers; App server; Memcached (20 GB); RDBMS; CMS; M/Q; Solr; Solr/API (×5); Cloud, EC2.
  • 24. Content API. Mutualised news! Read API delivered using Apache Solr. Hosted in EC2. Document oriented search engine. Loose schema: records, fields, facets. Scales well for read operations.
  • 25. Related content from Solr. Introduction of memcached.
  • 26. We’ve solved our load problem (for now) but increased our complexity.
  • 27. We now have 3 models! RDBMS tables, Java objects, JSON API.
  • 31. JSON API is very simple. Multiple domain concepts expressed in a single document. Can be designed in a forwardly extensible way. What if the JSON API was our primary model?
  • 32. Full NoSQL, in development: The “It’s the future!” era
  • 33. The first project: Identity. Current login/registration system still in TCL/PL-SQL. 3M+ users in relational database. Very complex schema + PL-SQL. New system required. Can we migrate from Oracle to NoSQL?
  • 34. Database selection: Simple keystore. Too simple? Huge scalability. Do we need it? Schema design difficult. Simple to use, can execute similar queries to RDBMS.
  • 35. MongoDB: Document oriented database. Stores parsed JSON documents. Can express complex queries. Can be flexible about consistency. Malleable schema: can easily change at runtime. Can work at both large & small scales.
  • 36. MongoDB concepts (RDBMS → MongoDB): Table → Collection; Row → JSON Document; Index → Index; Join → Embedding & Linking; Partition → Shard.
  • 39. Flexible Schema: Can easily represent different classes of tag as documents. Both documents can be inserted into the same collection. Far simpler than equivalent Hibernate mapped subclass configuration.
  • 40. Flexible Schema: Simple to query:
  • 41. Flexible Schema: Simple to query. Query operators: $ne, $nin, $all, $exists, $gt, $lt, $gte ...
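The query examples on these slides were images and did not survive in the transcript. As a rough illustration of what the listed operators mean, here is a minimal in-memory sketch in Python. It is not the MongoDB driver or server implementation: the `matches` and `find` helpers, the sample tag documents, and their field names are all hypothetical, and only the operators named on slide 41 are handled.

```python
# Illustrative in-memory matcher for a few MongoDB-style query operators.
# This is NOT the MongoDB driver; helper names and sample data are hypothetical.
_MISSING = object()  # sentinel for "field not present in the document"


def _check(op, value, arg):
    """Evaluate one operator against one field value."""
    if op == "$exists":
        return (value is not _MISSING) == bool(arg)
    if op == "$ne":
        # A missing field also counts as "not equal".
        return value is _MISSING or value != arg
    if op == "$nin":
        return value is _MISSING or value not in arg
    if value is _MISSING:
        return False  # remaining operators need a value to compare
    if op == "$all":
        return isinstance(value, list) and all(item in value for item in arg)
    if op == "$gt":
        return value > arg
    if op == "$lt":
        return value < arg
    if op == "$gte":
        return value >= arg
    raise ValueError(f"unsupported operator: {op}")


def matches(doc, query):
    """Return True if doc satisfies every condition in query."""
    for field, cond in query.items():
        value = doc.get(field, _MISSING)
        if isinstance(cond, dict):
            if not all(_check(op, value, arg) for op, arg in cond.items()):
                return False
        elif value != cond:  # plain equality; a missing field never matches
            return False
    return True


def find(collection, query):
    """Filter a list of dict documents, loosely like a collection query."""
    return [doc for doc in collection if matches(doc, query)]


# Different classes of tag stored side by side, each with its own fields.
tags = [
    {"type": "keyword", "name": "politics", "articles": 120},
    {"type": "contributor", "name": "Jane Doe", "articles": 45,
     "sections": ["news", "comment"]},
    {"type": "series", "name": "Long reads", "articles": 8},
]

busy_tags = find(tags, {"articles": {"$gte": 45}})
tags_with_sections = find(tags, {"sections": {"$exists": True}})
```

Because documents of different shapes share one collection, a query such as `{"sections": {"$exists": True}}` selects only the documents that carry that field, with no subclass mapping required.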
  • 45. Schema upgrades: Schema can be upgraded simply by upgrading the application version. Application must deal with differing document versions. Can become complex over time.
  • 46. Schema upgrades: This can be mitigated by: adding a “version” key to each document; updating the version each time the application modifies a document; using MapReduce capability to forcibly migrate documents from older versions if required.
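One way to picture the “version” key approach is the Python sketch below. It works on plain dictionaries rather than a live database, and everything in it is an assumption made for illustration: the field names, the two upgrade steps, and the `migrate_all` loop, which merely stands in for the forced MapReduce migration mentioned on the slide.

```python
# Sketch of versioned documents upgraded lazily by the application.
# Field names and upgrade steps are hypothetical, not the Guardian's schema.
CURRENT_VERSION = 3


def _v1_to_v2(doc):
    # Hypothetical change: split a single "name" field into two fields.
    first, _, last = doc.pop("name", "").partition(" ")
    doc["firstName"], doc["lastName"] = first, last
    return doc


def _v2_to_v3(doc):
    # Hypothetical change: introduce a "statuses" list with an empty default.
    doc.setdefault("statuses", [])
    return doc


# Maps a document version to the step that lifts it to the next version.
UPGRADES = {1: _v1_to_v2, 2: _v2_to_v3}


def upgrade(doc):
    """Return a copy of doc brought up to CURRENT_VERSION.

    Documents without a "version" key are treated as version 1.
    """
    doc = dict(doc)  # shallow copy so the caller's document is not mutated
    version = doc.get("version", 1)
    while version < CURRENT_VERSION:
        doc = UPGRADES[version](doc)
        version += 1
    doc["version"] = version
    return doc


def migrate_all(collection):
    """Batch stand-in for a forced migration of every stored document."""
    return [upgrade(doc) for doc in collection]


old_user = {"name": "Ada Lovelace"}
new_user = upgrade(old_user)
```

The application calls `upgrade` whenever it reads or modifies a document, so old and new versions can coexist until a batch migration is run.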
  • 47. MongoDB architecture: Single node (mongod). Durability only possible in upcoming 1.8 release (database fsync from buffer every min).
  • 48. MongoDB architecture: Replica set of mongod nodes (one master, several replicas). Can choose to read & write from master for full consistency. Can choose to run reads on slaves to scale reads.
  • 49. MongoDB architecture: Replica set of mongod nodes (one master, several replicas). Durability achieved (<1.8) via replication. Reads can be scaled out onto replicas (eventual consistency). All writes to master. If master fails, new master nominated by election. DB drivers handle most cluster complexity. Can choose to read & write from master for full consistency. Can choose to accept dirty reads from slaves to scale reads.
  • 50. MongoDB architecture: Aggregator (mongos) in front of four shards. Each shard is a master (consistent) with three replicas (inconsistent).
  • 51. MongoDB architecture: Aggregator (mongos) in front of four shards, each a master (consistent) with three replicas (inconsistent). Writes scaled by sharding. Shards populated by ranges. mongos queries appropriate shard(s). Shards automatically balanced. Developers (essentially) unaware of shards.
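The idea that shards are populated by key ranges, and that the aggregator queries only the appropriate shard(s), can be sketched in a few lines of Python. This is a toy model, not how mongos is implemented: the `RangeRouter` class, the split points, and the shard names are hypothetical, and automatic balancing is not modelled.

```python
import bisect


class RangeRouter:
    """Toy router mapping shard-key ranges to shard names.

    With split points ["g", "n", "t"], keys below "g" go to the first shard,
    keys from "g" up to (not including) "n" go to the second, and so on.
    """

    def __init__(self, split_points, shard_names):
        if len(shard_names) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be sorted")
        self.split_points = list(split_points)
        self.shard_names = list(shard_names)

    def shard_for(self, key):
        """Return the single shard responsible for one key."""
        return self.shard_names[bisect.bisect_right(self.split_points, key)]

    def shards_for_range(self, low, high):
        """Return the shards that may hold keys in the inclusive range [low, high]."""
        first = bisect.bisect_right(self.split_points, low)
        last = bisect.bisect_right(self.split_points, high)
        return self.shard_names[first:last + 1]


# Hypothetical cluster: four shards split on the first letters of a username.
router = RangeRouter(["g", "n", "t"], ["shard0", "shard1", "shard2", "shard3"])

single = router.shard_for("guardian")         # a point lookup touches one shard
several = router.shards_for_range("h", "p")   # a range query touches a subset
```

A lookup on the shard key is sent to one shard, while a range query fans out only to the shards whose ranges overlap it, which is why application code can stay largely unaware of the sharding.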
  • 52. MongoDB durability: Relies (pre 1.8) on replication for durability. 1.8 features optional journaling & redo logs. Database users need to be cluster aware; each query can specify: no error checking / write confirmation; write confirmed on master; write replicated to N slave servers.
  • 53. Old Identity system: Hundreds of tables & stored procedures. New Identity model (diagram labels): User, List, Fields, Text, Dates, Date/Time, Statuses, Boolean.
  • 54.
  • 55. Very simple domain objects: Simple, flexible objects. No Hibernate session.
  • 56. Very simple domain objects: Flexible schema embraced in domain object design.
  • 57. Very simple domain objects: Using Casbah Scala drivers = significant reduction in LOC vs SQL implementation.
  • 58. Build API that can support both backends. Diagram: Registration app; guardian.co.uk; API; MongoDB; Oracle.
  • 59. Build API that can support both backends. Diagram: Registration app; guardian.co.uk; API; MongoDB; Oracle. This bit is hard!
  • 60. Migrate using API & decommission. Diagram: Registration app; guardian.co.uk; API; MongoDB.
  • 61. Add new stuff! Diagram: Registration app; guardian.co.uk; API; MongoDB; Solr? Redis?
  • 62. MongoDB: Simple, flexible schema with similar query & indexing to RDBMS. Great at small or large scale. Easy for developers to get going. Commercial support available (10gen). One day may power all of guardian.co.uk. No transactions / joins: developers must cater for this. Produces a net reduction in lines of code / complexity.
  • 63. Shameless plug. We’re hiring: http://www.careersatgnl.co.uk