NoSQL: Channeling the Data Explosion
  Dwight Merriman
  CEO, 10gen
  @dmerr  dmerr.tumblr.com
  GlueCon 2010
The database world is changing
  • No longer one-size-fits-all
NoSQL = Non-relational
  • next-generation operational data stores and databases
Scaling Out
  • no joins + light transactional semantics = horizontally scalable architectures
Why?
  • cloud
  • commodity
  http://www.globalnerdy.com/2007/09/07/multicore-musings/
How the NoSQL Products Vary
  • What’s the same
      • No joins
      • No complex transactions
  • What varies
      • Scale-out model
      • Consistency model
      • Data model
Scaling Out: distribution & query models
  • Consistent hashing
  • Order-preserving range chunking
  • Scatter gather
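To make the first bullet concrete, here is a minimal consistent-hashing sketch. It is an illustration only, not how any particular product does it: the FNV-1a hash, the node names, the key names, and the 64 virtual points per node are all assumptions chosen for the example.

```javascript
// FNV-1a 32-bit hash: a simple, deterministic stand-in for a real hash function (assumption).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Build a sorted ring of [position, node] points, with several virtual points per node.
function buildRing(nodes, pointsPerNode = 64) {
  const ring = [];
  for (const node of nodes) {
    for (let i = 0; i < pointsPerNode; i++) {
      ring.push([fnv1a(`${node}#${i}`), node]);
    }
  }
  ring.sort((a, b) => a[0] - b[0]);
  return ring;
}

// A key is owned by the first ring point at or after its hash, wrapping around at the end.
function lookup(ring, key) {
  const h = fnv1a(key);
  let lo = 0;
  let hi = ring.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (ring[mid][0] < h) lo = mid + 1;
    else hi = mid;
  }
  return ring[lo % ring.length][1];
}

// Hypothetical cluster: three nodes, then a fourth is added.
const keys = Array.from({ length: 1000 }, (_, i) => `user:${i}`);
const before = buildRing(["node-a", "node-b", "node-c"]);
const after = buildRing(["node-a", "node-b", "node-c", "node-d"]);

const moved = keys.filter((k) => lookup(before, k) !== lookup(after, k));
const movedElsewhere = moved.filter((k) => lookup(after, k) !== "node-d");
console.log(`${moved.length} of ${keys.length} keys changed owner`);
console.log(`${movedElsewhere.length} of those moved between the old nodes`);
```

The property worth noticing is that every key that changes owner moves to the new node; nothing is reshuffled among the existing nodes. Order-preserving range chunking trades this away for efficient range queries, since neighbouring keys stay on the same chunk.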
Data models
  • no joins + light transactional semantics = horizontally scalable architectures
  • Important side effect: new data models = improved ways to develop apps
Data Models
  • Key/value
  • Column-oriented “bigtable-style”
  • Document-oriented (JSON)
Data Models
    { title: 'Too Big to Fail',
      author: 'John S',
      ts: Date("05-Nov-09 10:33"),
      comments: [ { author: 'Ian White',
                    comment: 'Great article!' },
                  { author: 'Joe Smith',
                    comment: 'But how fast is it?',
                    replies: [ { author: 'Jane Smith',
                                 comment: 'scalable?' } ]
                  }
                ],
      tags: ['finance', 'economy'] }

    db.posts.find( { tags: 'economy' } ).sort( { ts: -1 } ).limit(10).skip(10)
    db.posts.find( { "comments.author": "Ian White" } )
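The two queries rely on matching inside arrays and on dotted paths into embedded documents. The sketch below emulates that matching over plain in-memory objects to show the semantics; it is not MongoDB's implementation, and `valuesAt`, `find`, `page`, and the second post are hypothetical names and data added for illustration.

```javascript
// Sample collection: the post from the slide plus one hypothetical non-matching post.
const posts = [
  {
    title: "Too Big to Fail",
    author: "John S",
    ts: new Date("2009-11-05T10:33:00Z"),
    comments: [
      { author: "Ian White", comment: "Great article!" },
      {
        author: "Joe Smith",
        comment: "But how fast is it?",
        replies: [{ author: "Jane Smith", comment: "scalable?" }],
      },
    ],
    tags: ["finance", "economy"],
  },
  { title: "Hypothetical second post", author: "A N Other",
    ts: new Date("2009-11-06T09:00:00Z"), comments: [], tags: ["sports"] },
];

// Resolve a dotted path; whenever a step lands on an array, fan out over its elements.
function valuesAt(doc, path) {
  let current = [doc];
  for (const part of path.split(".")) {
    const next = [];
    for (const value of current) {
      const items = Array.isArray(value) ? value : [value];
      for (const item of items) {
        if (item !== null && typeof item === "object" && part in item) {
          next.push(item[part]);
        }
      }
    }
    current = next;
  }
  // An array-valued field matches if any of its elements equals the query value.
  return current.flatMap((v) => (Array.isArray(v) ? v : [v]));
}

// Equality-only matcher: every query path must reach the expected value.
function find(collection, query) {
  return collection.filter((doc) =>
    Object.entries(query).every(([path, expected]) =>
      valuesAt(doc, path).includes(expected)
    )
  );
}

// Newest first, then skip and limit (skip is applied before limit).
function page(results, skip, limit) {
  return [...results].sort((a, b) => b.ts - a.ts).slice(skip, skip + limit);
}

const byTag = find(posts, { tags: "economy" });
const byCommenter = find(posts, { "comments.author": "Ian White" });
const secondPage = page(byTag, 10, 10);
console.log(byTag.map((p) => p.title), byCommenter.map((p) => p.title), secondPage.length);
```

In the slide's first query, `.limit(10).skip(10)` returns the second page of ten results: MongoDB applies the skip before the limit regardless of the order in which the two are chained.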
Influences
CAP
  It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:
  • Availability
  • Atomic consistency
  in all fair executions (including those in which messages are lost).
Consistency Models - CAP
  • Choices are AP or CP
  • Write Availability, not Read Availability, is the Main Question
  • It’s not all about CAP
  • Eventual consistency makes these non-availability aspects better:
      • Multi data center
      • Speed
      • Even load distribution
Eventual Consistency
Eventual Consistency
  Read(x) : 1, 2, 2, 4, 4, 4, 4 …
Could we get this?
  Read(x) : 1, 2, 1, 4, 2, 4, 4, 4 …
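The difference between the two read sequences can be stated as a check for monotonic reads. The sketch below assumes the values returned by Read(x) are successive versions of x, so a smaller number means an older value; `isMonotonic` is a hypothetical helper, not part of any product.

```javascript
// Returns true when no read observes an older version than an earlier read did.
// Assumes each value is a version number of x (an assumption made for this example).
function isMonotonic(reads) {
  for (let i = 1; i < reads.length; i++) {
    if (reads[i] < reads[i - 1]) return false;
  }
  return true;
}

const firstSequence = [1, 2, 2, 4, 4, 4, 4];
const secondSequence = [1, 2, 1, 4, 2, 4, 4, 4];

console.log(isMonotonic(firstSequence));  // true
console.log(isMonotonic(secondSequence)); // false
```

Both sequences converge on 4, which is all that eventual consistency by itself promises. Ruling out the second sequence needs an additional guarantee, such as monotonic reads, on top of it.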
Terms
  • R, W, N
  • R+W>N has nice properties
  • Sloppy quorum
R+W>N
  If R+W > N, we can’t have both fast local reads and writes at the same time if all the data centers are equal peers?
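The "nice property" of R+W>N can be checked by brute force. The sketch assumes the usual Dynamo-style meanings, which the slides do not spell out: N is the number of replicas, W the number that must acknowledge a write, and R the number consulted on a read. `subsets` and `quorumsAlwaysOverlap` are hypothetical helper names.

```javascript
// All k-element subsets of the replica ids 0..n-1.
function subsets(n, k) {
  const out = [];
  const pick = (start, chosen) => {
    if (chosen.length === k) {
      out.push(chosen.slice());
      return;
    }
    for (let i = start; i < n; i++) {
      chosen.push(i);
      pick(i + 1, chosen);
      chosen.pop();
    }
  };
  pick(0, []);
  return out;
}

// True when every possible read quorum shares at least one replica
// with every possible write quorum.
function quorumsAlwaysOverlap(n, r, w) {
  const readSets = subsets(n, r);
  const writeSets = subsets(n, w);
  return writeSets.every((ws) =>
    readSets.every((rs) => rs.some((replica) => ws.includes(replica)))
  );
}

console.log(quorumsAlwaysOverlap(3, 2, 2)); // true:  2 + 2 > 3
console.log(quorumsAlwaysOverlap(3, 1, 3)); // true:  1 + 3 > 3
console.log(quorumsAlwaysOverlap(3, 1, 1)); // false: 1 + 1 <= 3
console.log(quorumsAlwaysOverlap(5, 2, 3)); // false: 2 + 3 <= 5
```

When the quorums always overlap, every read touches at least one replica that saw the latest acknowledged write. With, say, one replica in each of three peer data centers (an assumed layout), purely local reads and writes mean R = W = 1, which is the non-overlapping case; that is the tension the slide's question points at.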
Network Partitions
Trivial Network Partitions
Sometimes we need global state / more consistency
  • Unique key constraints
  • User registration
  • ACL changes
  • Are we surprising the user?
  • read-your-own-writes
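One way to get read-your-own-writes on top of lagging replicas is for the client session to remember the version of its last write and only accept reads from replicas that are at least that fresh. The sketch below is a toy model under that assumption; `Replica`, `Session`, the single version counter, and the sample value are hypothetical and do not describe any specific product.

```javascript
// A replica holds one value plus the version at which it was last written.
class Replica {
  constructor() {
    this.version = 0;
    this.value = undefined;
  }
  // Apply a write only if it is newer than what the replica already has.
  apply(version, value) {
    if (version > this.version) {
      this.version = version;
      this.value = value;
    }
  }
}

// A session writes to the primary and remembers the version it wrote.
class Session {
  constructor(primary, secondaries) {
    this.primary = primary;
    this.secondaries = secondaries;
    this.lastWritten = 0;
  }
  write(value) {
    const version = this.primary.version + 1;
    this.primary.apply(version, value);
    this.lastWritten = version;
  }
  // Read from a secondary only if it has caught up with this session's last write;
  // otherwise fall back to the primary.
  read() {
    const fresh = this.secondaries.find((r) => r.version >= this.lastWritten);
    return (fresh || this.primary).value;
  }
}

const primary = new Replica();
const lagging = new Replica();
const session = new Session(primary, [lagging]);

session.write("alice@example.com");
const naiveRead = lagging.value;          // what an unguarded secondary read would return
const beforeReplication = session.read(); // served by the primary
lagging.apply(primary.version, primary.value);
const afterReplication = session.read();  // now served by the secondary

console.log(naiveRead, beforeReplication, afterReplication);
```

The unguarded read is the kind of surprise the slide asks about: the user has just registered, yet a stale replica reports that nothing is there.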
Could it be the case that…
  uptime( CP + average developer ) >= uptime( AP + average developer )
  where uptime := system is up and non-buggy?
Predictions
  • JSON will be the most popular building block for non-relational data models
  • Tunable consistency in all the products
  • Some SQL in these products!
Questions?
  Thank you
  dwight@10gen.com
  @dmerr
  dmerr.tumblr.com
  @mongodb
  Download: www.mongodb.org
  10gen is hiring in SF and NYC – info@10gen.com

NOSQL Session GlueCon May 2010
