NOSQL 101
Or: How I Learned To Stop Worrying And Love The Mongo
Who Am I?
• Daniel Cousineau
• Sr. Software Applications Developer at
  Texas A&M University
• dcousineau@gmail.com
• Twitter: @dcousineau
• dcousineau.com
In the beginning... there
    was Accounting.
Row Based, Fixed Schema
The RDBMS was
created to address this
        usage.
RDBMS Ideology
• ACID
 • Atomicity
 • Consistency
 • Isolation
 • Durability
• All or nothing, no corruption, no mistakes
• Accounting errors are EXPENSIVE!
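The "all or nothing" rule can be illustrated outside any database. What follows is a minimal sketch in plain JavaScript, assuming a hypothetical in-memory ledger and a hypothetical `transfer` helper (neither comes from the talk): every step is applied to a working copy, and the copy replaces the live state only if all steps succeed.

```javascript
// Hypothetical in-memory ledger; all names here are illustrative only.
function transfer(accounts, from, to, amount) {
  // Work on a copy so that a failure leaves the original untouched.
  const draft = { ...accounts };
  if (!(from in draft) || !(to in draft)) {
    throw new Error('unknown account');
  }
  draft[from] -= amount;
  if (draft[from] < 0) {
    throw new Error('insufficient funds'); // nothing has been committed
  }
  draft[to] += amount;
  return draft; // "commit": the caller swaps in the new state
}

let ledger = { alice: 100, bob: 20 };
ledger = transfer(ledger, 'alice', 'bob', 30); // succeeds

try {
  ledger = transfer(ledger, 'alice', 'bob', 500); // fails, ledger is unchanged
} catch (e) {
  console.log('rolled back:', e.message);
}
console.log(ledger);
```

The failed transfer never touches `ledger`, which is the property an accounting system cares about.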
RDBMS Ideology


• Pessimistic
• Consistency at the end of EVERY step!
Moore’s Law happened.
Computers took on
more complex tasks...
Problems became...
     Dynamic
NOSQL attempts to
address the Dynamic.
What is NOSQL?
What Is NOSQL?


• Any data storage engine that does not use
  a SQL interface and does not use relational
  algebra
NOSQL Ideology
• BASE
 • Basically Available
 • Soft state
 • Eventually consistent
• Zen-like
• Content loss isn’t that big of a deal
NOSQL Ideology


• Optimistic
• State will be in flux, just accept it
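"Eventually consistent" can be shown in miniature. This is a sketch only, assuming a hypothetical two-node store (`makeStore`, `replicate`, and the rest are invented for illustration and are not any product's API): a write is visible on the primary at once, and a read from the replica may be stale until replication catches up.

```javascript
// Hypothetical two-node store: writes land on the primary immediately and
// reach the replica only when the pending log is replayed.
function makeStore() {
  const primary = {};
  const replica = {};
  const pending = []; // replication log not yet applied to the replica
  return {
    write(key, value) {
      primary[key] = value;
      pending.push([key, value]);
    },
    readFromReplica(key) {
      return replica[key]; // may be stale
    },
    replicate() {
      while (pending.length > 0) {
        const [key, value] = pending.shift();
        replica[key] = value;
      }
    },
  };
}

const store = makeStore();
store.write('price', 42);
const staleRead = store.readFromReplica('price'); // replica has not caught up yet
store.replicate();
const freshRead = store.readFromReplica('price'); // replica now agrees with the primary
console.log(staleRead, freshRead);
```

An application built on such a store has to decide what to do with `staleRead`, which is the "just accept it" part.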
NOSQL Ideology


• Don’t Do More Work Than You Have To
• Don’t Unnecessarily Duplicate Effort
NOSQL is diverse...
Types of NOSQL

• Key-Value Stores
 • memcache/memcachedb
 • riak
 • tokyo cabinet/tyrant
Types of NOSQL

• Column-oriented
 • dynamo (itself a key-value store; its design influenced cassandra)
 • bigtable
 • cassandra
Types of NOSQL


• Graph
 • neo4j
Types of NOSQL

• Document-oriented
 • couchdb
 • MongoDB
Let's focus on
 MongoDB...
Why MongoDB?
• Because it’s what I need
• Because I understand it
• Because I’ve used it
• Because it’s easy
• Because it has superior driver support
• Because I said so
Support?
• Operating Systems
 • OSX 32/64bit
 • Linux 32/64bit
 • Windows 32/64bit
 • Solaris i86pc
 • Solaris 64
• Official Drivers
 • C, C++, Java, JavaScript, Perl, PHP, Python, Ruby
• Community Drivers
 • REST, C#, Clojure, ColdFusion, Delphi, Erlang, Factor, Fantom, F#,
    Go, Groovy, Haskell, Lua, Obj-C, PowerShell, Scala, Scheme, Smalltalk
What is MongoDB?
• A document-based storage system
• Databases contain collections, collections
  contain documents
• Documents are arbitrary BSON (a binary
  extension of JSON) objects
• No schema is enforced
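The containment hierarchy and the lack of an enforced schema can be pictured with ordinary objects. This is a rough in-memory analogy in plain JavaScript, not MongoDB itself; the `database` object and the sample documents are made up for illustration.

```javascript
// Hypothetical in-memory stand-in: a database is a map of collection names
// to arrays of documents. Nothing checks the shape of a document.
const database = { students: [], courses: [] };

database.students.push({ uin: '000000001', major: 'ACCT' });
database.students.push({ uin: '000000002', disabilities: ['Asthma'] }); // different keys, same collection

// Collect every key that appears anywhere in the collection.
const keysSeen = new Set();
for (const doc of database.students) {
  Object.keys(doc).forEach((k) => keysSeen.add(k));
}
console.log([...keysSeen].sort());
```

Two documents in the same collection carry different keys and nothing complains, which is what "no schema is enforced" means in practice.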
What is MongoDB?

• Drivers expose MongoDB query API to
  languages in a form familiar and native
• Drivers usually handle serialization
 • You always work in native system
    objects, BSON is really only used
    internally
Install MongoDB

$ wget http://fastdl.mongodb.org/osx/mongodb-osx-x86_64-1.6.0.tgz
$ tar -xzvf ./mongodb-osx-x86_64-1.6.0.tgz
$ ./mongodb-osx-x86_64-1.6.0/bin/mongod --dbpath=/path/to/save/db
Manage MongoDB
$ ./mongodb-osx-x86_64-1.6.0/bin/mongo
MongoDB shell version: 1.6.0
connecting to: test
> db.foo.save( {a:1} )
> db.foo.find()
{ "_id" : ObjectId("4c60d0143cd09f6d17a18094"), "a" : 1 }
>
Simple Queries
$ ./mongodb-osx-x86_64-1.6.0/bin/mongo
MongoDB shell version: 1.6.0
connecting to: test
> db.foo.find()                       // Get all records
> db.foo.find( {a: 1} )               // Get all records where property ‘a’ is 1
> db.foo.find( {a: 1}, {_id: 1} )     // Get the ‘_id’ property of all records where ‘a’ is 1
> db.foo.find().limit( 1 )            // Get 1 record
> db.foo.find().sort( {a: -1} )       // Get all records sorted by property ‘a’ in reverse
>
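For readers new to the shell, each query above has a rough plain-JavaScript counterpart over an array. This is an analogy only, assuming a hypothetical array `foo` standing in for the collection; it says nothing about how MongoDB executes queries, and it ignores details such as matching inside array values.

```javascript
// Hypothetical stand-in for the `foo` collection.
const foo = [
  { _id: 1, a: 1 },
  { _id: 2, a: 3 },
  { _id: 3, a: 1 },
];

const all = foo.slice();                               // db.foo.find()
const whereA1 = foo.filter((d) => d.a === 1);          // db.foo.find( {a: 1} )
const idsOnly = whereA1.map((d) => ({ _id: d._id }));  // db.foo.find( {a: 1}, {_id: 1} )
const firstOne = foo.slice(0, 1);                      // db.foo.find().limit( 1 )
const byADesc = foo.slice().sort((x, y) => y.a - x.a); // db.foo.find().sort( {a: -1} )

console.log(idsOnly, byADesc[0]);
```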
Some Common
  Questions
So should I choose
MongoDB over MySQL?

• Bad Question!
• 90% of the time you’ll probably implement
  a hybrid system.
When should I use
     MongoDB?
• When an ORM is necessary
 • It’s in the name, Object-Relational
    Mapper
• When you use a metric ton of 1:n and n:m
  tables to generate 1 object
  • And you rarely if ever use them for
    reporting
MongoDB performance
     is better?
• Too simple of a question
• Performance is comparable; MySQL usually
  wins on sheer query speed
 • Sterilized Lab Test
• MongoDB usually wins due to fewer
  queries required and no object reassembly
 • The Real World
Can MongoDB enforce
     a schema?
• You can add indexes on arbitrary
  key patterns
• Otherwise, why?
 • Application is responsible for correctness
    and error handling, no need to duplicate
    that work in the database
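If correctness lives in the application, the check might look like the sketch below. It is plain JavaScript under stated assumptions: `validateDocument` and the three rules it applies (`name`, `age`, `tags`) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical validator run by the application before saving a document.
// The field names and rules are assumptions for illustration.
function validateDocument(doc) {
  const errors = [];
  if (typeof doc.name !== 'string' || doc.name.length === 0) {
    errors.push('name is required');
  }
  if (doc.age !== undefined && !Number.isInteger(doc.age)) {
    errors.push('age must be an integer when present');
  }
  if (doc.tags !== undefined && !Array.isArray(doc.tags)) {
    errors.push('tags must be an array when present');
  }
  return errors;
}

const okErrors = validateDocument({ name: 'Ada', tags: ['admin'] });
const badErrors = validateDocument({ age: '37', tags: 'admin' });
console.log(okErrors, badErrors);
```

Returning a list of messages rather than throwing lets the caller decide whether to reject the write or repair the document.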
Can I trust eventual
     consistency?

• No, but you shouldn’t trust ACID either
• Build your application to be flexible and to
  handle consistency issues
  • Stale data is a fact of life
Can MongoDB Handle
 Large Deployments?
• 1.2 TB over 5 billion records
• 600+ million documents
• Migrating ENTIRE app from Postgres

http://www.mongodb.org/display/DOCS/Production+Deployments
Can MongoDB Handle
 Large Deployments?
• huMONGOousDB
• 32-bit only supports ~2.5GB
 • Memory-mapped files
• Individual documents limited to 4MB
Why waste time with
     theory?
2 Case Studies
The ‘Have Done’




  http://orthochronos.com
Very Simple

• Daemon does insertion in the background
• Front end just does simple grabs
 • Grab 1, Grab Many, etc.
Data Model
• System has Stocks
• Each Stock has Daily (once per day) and
  IntraDaily (every 15 minutes) data
  • Limited to trading hours
• Each set of data (daily or intradaily) has 4
  Graphs
• Each graph has upwards of 6 Lines
• Each line has between 300 and 800 Points
Data Model
• With each data point representing 1
  minute, each 15-minute IntraDay graph will
  have about 785 overlapping points with the
  preceding graph
• Why not consolidate into a single running
  table, and just SELECT ... LIMIT 800
  points from X timestamp?
Data Model

• The Algorithm will cause past points to
  change
• But each graph should be preserved so one
  can see the historical evolution of a given
  curve
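The two requirements above, that past points can change and that every graph must be preserved, point toward storing one full snapshot per interval instead of a single running series. This is a minimal sketch in plain JavaScript; `saveSnapshot`, `latestSnapshot`, and the sample numbers are hypothetical.

```javascript
// Hypothetical snapshot store: one full document per interval, so a later
// recalculation never overwrites what an earlier graph showed.
const snapshots = [];

function saveSnapshot(timestamp, points) {
  snapshots.push({ timestamp, points: points.slice() }); // copy, never mutate older snapshots
}

function latestSnapshot() {
  return snapshots.reduce(
    (best, s) => (best === null || s.timestamp > best.timestamp ? s : best),
    null
  );
}

saveSnapshot(1000, [1.0, 2.0, 3.0]);
saveSnapshot(1900, [1.1, 2.0, 3.0, 4.0]); // the algorithm revised the first point

const latest = latestSnapshot();
const original = snapshots.find((s) => s.timestamp === 1000);
console.log(latest.points[0], original.points[0]);
```

The revised value appears only in the newer snapshot, so the evolution of the curve can still be replayed.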
Data Model


• Now imagine implementing these
  requirements in a traditional RDBMS
Data Model


• Instead, let's see my MongoDB
  implementation
Database Layout
~/MongoDB/bin $ ./mongo
MongoDB shell version: 1.6.0
connecting to: test
> use DBNAME
switched to db DBNAME
> show collections
aapl
aapl.daily
aapl.intraday
acas
acas.daily
acas.intraday
...
wfr
wfr.daily
wfr.intraday
x
x.daily
x.intraday
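The listing suggests a naming convention of one base collection per symbol plus `.daily` and `.intraday` companions. The sketch below, in plain JavaScript, builds those names and recovers the symbol list from them. The helper names are hypothetical, and the convention itself is an assumption read off the listing.

```javascript
// Assumed convention, read off the listing above: "<symbol>",
// "<symbol>.daily" and "<symbol>.intraday".
function collectionsFor(symbol) {
  const base = symbol.toLowerCase();
  return [base, base + '.daily', base + '.intraday'];
}

// Recover the symbol list by keeping only names without a dotted suffix.
function symbolsFrom(collectionNames) {
  return collectionNames
    .filter((name) => /^[a-z0-9]+$/i.test(name))
    .map((name) => name.toUpperCase())
    .sort();
}

const names = [...collectionsFor('X'), ...collectionsFor('AAPL')];
console.log(symbolsFrom(names));
```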
Stock ‘Metadata’


{ "_id" : ObjectId("4c5a15038ead0eec04000000"), "timestamp" : 1228995000,
"data" : [ "0", "92", "0" ] }
Interval Data
{ "_id" : ObjectId("4bb901cc8ead0e041a0d0000"), "timestamp" : 1228994100, "number" : 3, "charts" : [
    {
        "number" : "1",
        "data" : [
            -99999,
            -99999,
            -99999,
            -99999,
            -99999
        ],
        "domainLength" : 300,
        "domainDates" : [
            "Tue Nov 25 10:45:00 GMT-0500 2008",
                  ...
        ],
        "lines" : [
            {
                "first" : 76,
                "last" : 300,
                "points" : [
                      {
                          "index" : 1,
                          "value" : 0
                      },
                      { ... }
                ]
            }, { ... }, { ... }, { ... }, { ... }, { ... }
        ]
    }, { ... }, { ... }, { ... }
] }
Connect
/**
 * Lazily create the database handle and reuse it on later calls.
 *
 * @return MongoDB
 */
protected static function _getDb()
{
    static $db;

    if( !isset($db) )
    {
        $mongo = new Mongo();
        $db = $mongo->selectDB('...db name...');
    }

    return $db;
}
SELECT TOP 1...
//...

$db = self::_getDb();

$collection = $db->selectCollection(strtolower($symbol));

$dti = $collection->find()
                  ->sort(array(
                      'timestamp' => -1
                  ))
                  ->limit(1)
                  ->getNext();

//...
Get Specific Timestamp
//...

$tstamp = strtotime($lastTimestamp);

$cur = $collection->find(array(
                      'timestamp' => $tstamp
                  ))
                  ->limit(1);

//...
Only Get Timestamps
//...

$dailyCollection = $db->selectCollection(strtolower($symbol).'.daily');

$dailyCur = $dailyCollection->find(array(), array('timestamp' => true))
                            ->sort(array(
                                'timestamp' => 1
                            ));

foreach( $dailyCur as $timestamp ) {
    //...
}

//...
Utilizing Collections
//...

$db = self::_getDb();

$stocks = array();
foreach( $db->listCollections() as $collection )
{
    $collection_name = strtolower($collection->getName());

    if( preg_match('#^[a-z0-9]+$#i', $collection_name) )
    {
        $collection_name = strtoupper($collection_name);

        $stocks[] = $collection_name;
    }
}

sort($stocks);

//...
The ‘Wish I Did’
Not Too Terrible
• Keep track of Student Cases
 • A case keeps track of demographics,
    diagnoses, disabilities, notes, schedule, etc.
• Also tracks Exams
 • Schedule multiple exams per course
• Finally, students can log into a portal,
  counselors can log in
  • Basic user management
Mostly Static


• Most information display only
• Even with reporting
So I used good old
fashioned RDBMS
      design...
lolwut?
Instead...

• A collection for Student Cases
• A collection for Courses
• etc...
• Denormalize!
Boyce-Codd Who?
A Student Document
{
    "_id":ObjectId("4c5f572e8ead0ed00d0f0000"),
    "uin":"485596916",
    "firstname":"Zach",
    "middleinitial":"I",
    "lastname":"Hill",
    "major":"ACCT",
    "classification":"G4",
    "registrationdate":"2008-03-09",
    "renewaldates":[
       ...
       {
           "semester":"201021",
           "date":"2010-05-15"
       }
    ],
    "localaddress":{
       "street":"9116 View",
       "city":"College Station",
       "state":"TX",
       "zip":"77840"
    },
    "permanentaddress":{
       "street":"3960 Lake",
       "city":"Los Angeles",
       "state":"CA",
       "zip":"90001"
    },
    "disabilities":[
       "Blind",
       "Asthma"
    ],
    "casenotes":[
       ...
       {
           "counselor":"Zander King",
           "note":"lorem ipsum bibendum enim ..."
       }
    ],
    "schedules":[
       {
           "semester":"201021",
           "courses":[
              ...
              {
                  "$ref":"courses",
                  "$id":ObjectId(...)
              }
           ]
       }
    ]
}
A Course Document

  {
      "_id" : ObjectId("4c5f572e8ead0ed00d000000"),
      "subject" : "MECH",
      "course" : 175,
      "section" : 506,
      "faculty" : "Dr. Quagmire Hill"
  }
I lied...
MongoDB DBRef

• Similar to FK
• Requires driver support
• Not query-able
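"Requires driver support" means the reference is followed on the application side with a second lookup. The sketch below shows that idea in plain JavaScript, assuming hypothetical in-memory collections and plain string ids in place of ObjectId values; `resolveRef` is an invented helper, not a driver function.

```javascript
// Hypothetical in-memory collections keyed by name; ids are plain strings
// here rather than ObjectId values.
const collections = {
  courses: [
    { _id: 'c1', subject: 'MECH', course: 175 },
    { _id: 'c2', subject: 'ACCT', course: 229 },
  ],
};

// Follow a { $ref, $id } pointer by hand with a second lookup.
function resolveRef(ref) {
  const target = collections[ref.$ref] || [];
  return target.find((doc) => doc._id === ref.$id) || null;
}

const schedule = { semester: '201021', courses: [{ $ref: 'courses', $id: 'c1' }] };
const resolved = schedule.courses.map(resolveRef);
const missing = resolveRef({ $ref: 'courses', $id: 'zz' });
console.log(resolved[0].subject, missing);
```

Unlike a foreign key, nothing stops a reference from pointing at a document that no longer exists, so the `null` case has to be handled.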
But can we do
everything we need?
Paginate?
> db.students.find( {}, {_id:1} ).skip( 20 ).limit( 10 )
{ "_id" : ObjectId("4c5f572e8ead0ed00d230000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d240000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d250000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d260000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d270000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d280000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d290000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d2a0000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d2b0000") }
{ "_id" : ObjectId("4c5f572e8ead0ed00d2c0000") }
>
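The `skip` and `limit` values follow directly from a page number. Here is a small sketch in plain JavaScript; `pageWindow` is a hypothetical helper and the 1-based page numbering is an assumption.

```javascript
// Hypothetical helper: translate a 1-based page number into the skip/limit
// pair used in the query above.
function pageWindow(page, perPage) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be an integer >= 1');
  }
  return { skip: (page - 1) * perPage, limit: perPage };
}

const thirdPage = pageWindow(3, 10);
console.log(thirdPage); // page 3 at 10 per page corresponds to skip( 20 ).limit( 10 )
```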
Only Renewed Once?

> db.students.find( {renewaldates: {$size: 1}}, {_id:1, renewaldates:1} )
{ "_id" : ObjectId("4c5f572e8ead0ed00d150000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d190000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d440000"), "renewaldates" : [ { "semester" : "201021", "date" : "2010-05-15" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d460000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d4e0000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d660000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d6f0000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d720000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d750000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d800000"), "renewaldates" : [ { "semester" : "201021", "date" : "2010-05-15" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d880000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d8d0000"), "renewaldates" : [ { "semester" : "201021", "date" : "2010-05-15" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d8f0000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
{ "_id" : ObjectId("4c5f572e8ead0ed00d940000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] }
>
How Many Blind Or
           Asthmatic?

> db.students.find( {disabilities: {$in: ['Blind', 'Asthma']}} ).count()
76
>
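The query above uses `$in`, which matches a student whose `disabilities` array contains either value. Counting students who have both would call for `$all` instead. The difference is sketched below in plain JavaScript over a made-up array; this is an analogy for the two operators, not shell code.

```javascript
// Plain-JavaScript analogy for two array operators; the sample students
// are made up.
const sample = [
  { uin: 's1', disabilities: ['Blind'] },
  { uin: 's2', disabilities: ['Asthma', 'Blind'] },
  { uin: 's3', disabilities: ['Dyslexia'] },
];
const wanted = ['Blind', 'Asthma'];

// $in: the array shares at least one value with the list.
const matchesIn = sample.filter((s) => s.disabilities.some((d) => wanted.includes(d)));

// $all: the array contains every value in the list.
const matchesAll = sample.filter((s) => wanted.every((w) => s.disabilities.includes(w)));

console.log(matchesIn.length, matchesAll.length);
```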
Who Lives On Elm?

> db.students.find( {'localaddress.street': /.*Elm/}, {_id:1, 'localaddress.street':1} )
{ "_id" : ObjectId("4c5f572e8ead0ed00d220000"), "localaddress" : { "street" : "2807 Elm" } }
{ "_id" : ObjectId("4c5f572e8ead0ed00d290000"), "localaddress" : { "street" : "5762 Elm" } }
{ "_id" : ObjectId("4c5f572e8ead0ed00d400000"), "localaddress" : { "street" : "6261 Elm" } }
{ "_id" : ObjectId("4c5f572e8ead0ed00d610000"), "localaddress" : { "street" : "7099 Elm" } }
{ "_id" : ObjectId("4c5f572e8ead0ed00d930000"), "localaddress" : { "street" : "4994 Elm" } }
{ "_id" : ObjectId("4c5f572e8ead0ed00d960000"), "localaddress" : { "street" : "3456 Elm" } }
>
Number Registered On
   Or Before 2010/06/20?
> db.students
    .find( {$where: "new Date(this.registrationdate) <= new Date('2010/06/20')"} )
    .count()
136
>
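When dates are stored as zero-padded `YYYY-MM-DD` strings, string order matches calendar order, so the cutoff can also be checked without constructing `Date` objects. The sketch below uses plain JavaScript with made-up dates. In the shell the analogous filter would be `{registrationdate: {$lte: '2010-06-20'}}`, assuming every document stores the date in that format.

```javascript
// Made-up registration dates in zero-padded YYYY-MM-DD form.
const registrations = ['2008-03-09', '2010-06-20', '2010-06-21', '2009-12-31'];
const cutoff = '2010-06-20';

// Lexicographic comparison agrees with date order for this format.
const onOrBefore = registrations.filter((d) => d <= cutoff);

console.log(onOrBefore.length);
```

A plain range filter like this can use an index on the field, whereas `$where` evaluates JavaScript against each document.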
Update Classification?

> db.students.find( {uin:'735383393'}, {_id: 1, uin: 1, classification: 1} )
{ "_id" : ObjectId("4c60dae48ead0e143e0f0000"), "uin" : "735383393", "classification" : "G2" }

> db.students.update( {uin:'735383393'}, {$set: {classification: 'G3'}} )

> db.students.find( {uin:'735383393'}, {_id: 1, uin: 1, classification: 1} )
{ "_id" : ObjectId("4c60dae48ead0e143e0f0000"), "uin" : "735383393", "classification" : "G3" }
>
Home Towns?
> db.students.distinct('permanentaddress.city')
[
    "Austin",
    "Chicago",
    "Dallas",
    "Denver",
    "Houston",
    "Los Angeles",
    "Lubbock",
    "New York"
]
>
Number of students by
      major?
  > db.students
      .group({
          key: {major:true},
          cond: {major: {$exists: true}},
          reduce: function(obj, prev) { prev.count += 1; },
          initial: { count: 0 }
      })
  [
      {"major" : "CPSC", "count" : 12},
      {"major" : "MECH", "count" : 16},
      {"major" : "ACCT", "count" : 18},
      {"major" : "MGMT", "count" : 18},
      {"major" : "FINC", "count" : 16},
      {"major" : "ENDS", "count" : 15},
      {"major" : "ARCH", "count" : 18},
      {"major" : "ENGL", "count" : 15},
      {"major" : "POLS", "count" : 22}
  ]
  >
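The `key`, `cond`, `initial`, and `reduce` pieces of the call above map onto an ordinary loop. This is a sketch in plain JavaScript over a made-up array; `groupCount` is a hypothetical helper that mirrors the shape of the call, not the server's implementation.

```javascript
// Same key / cond / initial / reduce shape as the group() call above,
// applied to a made-up array instead of a collection.
const people = [{ major: 'CPSC' }, { major: 'MECH' }, { major: 'CPSC' }, { name: 'no major' }];

function groupCount(docs, key) {
  const groups = new Map();
  for (const doc of docs) {
    if (!(key in doc)) continue;                           // cond: {major: {$exists: true}}
    if (!groups.has(doc[key])) {
      groups.set(doc[key], { [key]: doc[key], count: 0 }); // initial: { count: 0 }
    }
    groups.get(doc[key]).count += 1;                       // reduce: prev.count += 1
  }
  return [...groups.values()];
}

const byMajor = groupCount(people, 'major');
console.log(byMajor);
```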
How Many Students In
        A Course?
> db.students
    .find({'schedules.courses': {$in: [
         new DBRef('courses', new ObjectId('4c60dae48ead0e143e000000'))
    ]}})
    .count()
25
No Time To Cover...

• Map-Reduce
 • MongoDB has it, and it is extremely
    powerful
• GridFS
 • Store files/blobs
• Sharding/Replica Pairs/Master-Slave
Q&A
Resources
•   BASE: An Acid Alternative
    http://queue.acm.org/detail.cfm?id=1394128

•   PHP Mongo Driver Reference
    http://php.net/mongo

•   MongoDB Advanced Query Reference
    http://www.mongodb.org/display/DOCS/Advanced+Queries

•   MongoDB Query Cheat Sheet
    http://www.10gen.com/reference

•   myNoSQL Blog
    http://nosql.mypopescu.com/

APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

NOSQL 101, Or: How I Learned To Stop Worrying And Love The Mongo!

  • 1. NOSQL 101 Or: How I Learned To Stop Worrying And Love The Mongo
  • 2. Who Am I? • Daniel Cousineau • Sr. Software Applications Developer at Texas A&M University • dcousineau@gmail.com • Twitter: @dcousineau • dcousineau.com
  • 3. In the beginning... there was Accounting.
  • 5. The RDBMS was created to address this usage.
  • 6. RDBMS Ideology • ACID • Atomicity • Consistency • Isolation • Durability • All or nothing, no corruption, no mistakes • Accounting errors are EXPENSIVE!
  • 7. RDBMS Ideology • Pessimistic • Consistency at the end of EVERY step!
  • 9. Computers took on more complex tasks...
  • 13. What Is NOSQL? • Any data storage engine that does not use a SQL interface and does not use relational algebra
  • 14. NOSQL Ideology • BASE • Basically Available • Soft state • Eventually consistent • Zen-like • Content loss isn’t that big of a deal
  • 15. NOSQL Ideology • Optimistic • State will be in flux, just accept it
  • 16. NOSQL Ideology • Don’t Do More Work Than You Have To • Don’t Unnecessarily Duplicate Effort
  • 18. Types of NOSQL • Key-Value Stores • memcache/memcachedb • riak • tokyo cabinet/tyrant
  • 19. Types of NOSQL • Column-oriented • dynamo • bigtable • cassandra
  • 20. Types of NOSQL • Graph • neo4j
  • 21. Types of NOSQL • Document-oriented • couchdb • MongoDb
  • 22. Let's focus on MongoDB...
  • 23. Why MongoDB? • Because it’s what I need • Because I understand it • Because I’ve used it • Because it’s easy • Because it has superior driver support • Because I said so
  • 24. Support? • Operating Systems • Official Drivers • OSX 32/64bit • C, C++, Java, JavaScript, Perl, PHP, • Linux 32/64bit Python, Ruby • Windows 32/64bit • Community Drivers • Solaris i86pc • REST, C#, Clojure, ColdFusion, Delphi, Erlang, Factor, Fantom, F#, • Solaris 64 Go, Groovy, Haskell, Lua, Obj-C, PowerShell, Scala, Scheme, Smalltalk
  • 25. What is MongoDB? • A document-based storage system • Databases contain collections, collections contain documents • Documents are arbitrary BSON (extension of JSON) objects • No schema is enforced
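The hierarchy described on this slide can be pictured with plain JavaScript objects. This is only an illustrative sketch, not the MongoDB API; the names `server`, `school`, and `students` are made up. It shows that two documents in the same collection do not have to share the same fields.

```javascript
// Plain-object model of database -> collection -> documents.
// Hypothetical names; nothing here talks to a real MongoDB server.
const server = {
  school: {                 // a database
    students: [             // a collection
      { _id: 1, firstname: 'Zach', major: 'ACCT' },
      { _id: 2, firstname: 'Ann', disabilities: ['Asthma'] } // different fields, same collection
    ]
  }
};

// Collect every field name used in a collection; no schema forces them to match.
function fieldNames(collection) {
  const names = new Set();
  for (const doc of collection) {
    Object.keys(doc).forEach((key) => names.add(key));
  }
  return [...names].sort();
}

console.log(fieldNames(server.school.students));
```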
  • 26. What is MongoDB? • Drivers expose MongoDB query API to languages in a form familiar and native • Drivers usually handle serialization • You always work in native system objects, BSON is really only used internally
  • 27. Install MongoDB $ wget http://fastdl.mongodb.org/osx/mongodb-osx-x86_64-1.6.0.tgz $ tar -xzvf ./mongodb-osx-x86_64-1.6.0.tgz $ ./mongodb-osx-x86_64-1.6.0/bin/mongod --dbpath=/path/to/save/db
  • 28. Manage MongoDB $ ./mongodb-osx-x86_64-1.6.0/bin/mongo MongoDB shell version: 1.6.0 connecting to: test > db.foo.save( {a:1} ) > db.foo.find() { "_id" : ObjectId("4c60d0143cd09f6d17a18094"), "a" : 1 } >
  • 29. Simple Queries $ ./mongodb-osx-x86_64-1.6.0/bin/mongo MongoDB shell version: 1.6.0 connecting to: test > db.foo.find() Get All Records > db.foo.find( {a: 1} ) Get All Records Where Property ‘a’ Is 1 > db.foo.find( {a: 1}, {_id: 1} ) Get The ‘_id’ Property Of All Records Where ‘a’ Is 1 > db.foo.find().limit( 1 ) Get 1 Record > db.foo.find().sort( {a: -1} ) Get All Records Sorted By Property ‘a’ In Reverse >
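To make the meaning of these shell calls concrete, here is a rough in-memory imitation in plain JavaScript. The helpers `find`, `project`, and `sortBy` are assumed names written for this sketch, not driver or shell functions, and they only cover exact-match filters.

```javascript
// In-memory stand-ins for the shell calls above (assumed helper names).
const foo = [{ _id: 'x1', a: 1 }, { _id: 'x2', a: 3 }, { _id: 'x3', a: 1 }];

// find(filter): keep documents whose fields equal every value in the filter.
function find(docs, filter = {}) {
  return docs.filter((doc) => Object.keys(filter).every((key) => doc[key] === filter[key]));
}

// Projection: keep only the named fields of each document.
function project(docs, fields) {
  return docs.map((doc) =>
    Object.fromEntries(fields.filter((f) => f in doc).map((f) => [f, doc[f]])));
}

// sort({a: -1}): direction 1 is ascending, -1 is descending (numeric fields only).
function sortBy(docs, field, direction) {
  return [...docs].sort((x, y) => (x[field] - y[field]) * direction);
}

console.log(find(foo, { a: 1 }));                    // like find({a: 1})
console.log(project(find(foo, { a: 1 }), ['_id']));  // like find({a: 1}, {_id: 1})
console.log(sortBy(foo, 'a', -1).slice(0, 1));       // like find().sort({a: -1}).limit(1)
```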
  • 30. Some Common Questions
  • 31. So I should choose MongoDB over MySQL? • Bad Question! • 90% of the time you’ll probably implement a hybrid system.
  • 32. When should I use MongoDB? • When an ORM is necessary • It’s in the name, Object-Relational Mapper • When you use a metric ton of 1:n and n:m tables to generate 1 object • And you rarely if ever use them for reporting
  • 33. MongoDB performance is better? • Too simple of a question • Sterilized lab test: performance is comparable, and MySQL usually wins on sheer query speed • The real world: MongoDB usually wins because it needs fewer queries and no object reassembly
  • 34. Can MongoDB enforce a schema? • You can add indexes on arbitrary key patterns • Otherwise, why? • Application is responsible for correctness and error handling, no need to duplicate
  • 35. Can I trust eventual consistency? • No, but you shouldn’t trust ACID either • Build your application to be flexible and to handle consistency issues • Stale data is a fact of life
  • 36. Can MongoDB Handle Large Deployments? 1.2 TB over 5 billion records 600+ million documents Migrating ENTIRE app from Postgres http://www.mongodb.org/display/DOCS/Production+Deployments
  • 37. Can MongoDB Handle Large Deployments? • huMONGOusDB • 32-bit only supports ~2.5GB • Memory-mapped files • Individual documents limited to 4MB
  • 38. Why waste time with theory?
  • 40. The ‘Have Done’ http://orthochronos.com
  • 41. Very Simple • Daemon does insertion in the background • Front end just does simple grabs • Grab 1, Grab Many, etc.
  • 42. Data Model • System has Stocks • Each Stock has Daily (once per day) and IntraDaily (every 15 minutes) data • Limited to trading hours • Each set of data (daily or intradaily) has 4 Graphs • Each graph has upwards of 6 Lines • Each line has between 300 and 800 Points
  • 43. Data Model • With each data point representing 1 minute, each 15-minute IntraDay graph will have about 785 overlapping points with the preceding graph • Why not consolidate into a single running table, and just SELECT ... LIMIT 800 points from X timestamp?
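The figure of about 785 overlapping points follows from simple arithmetic if a line carries 800 one-minute points and a new graph is produced every 15 minutes. The worked check below assumes the upper bound of 800 points; the function name is made up for illustration.

```javascript
// Overlap between two consecutive sliding windows of one-minute points.
// Assumption: each window holds 800 points and windows start 15 minutes apart.
function overlappingPoints(windowSize, stepMinutes) {
  return Math.max(windowSize - stepMinutes, 0);
}

console.log(overlappingPoints(800, 15)); // 785
```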
  • 44. Data Model • The Algorithm will cause past points to change • But each graph should be preserved so one can see the historical evolution of a given curve
  • 45. Data Model • Now imagine implementing these requirements in a traditional RDBMS
  • 46. Data Model • Instead, let's see my MongoDB implementation
  • 47. Database Layout ~/MongoDB/bin $ ./mongo MongoDB shell version: 1.6.0 connecting to: test > use DBNAME switched to db DBNAME > show collections aapl aapl.daily aapl.intraday acas acas.daily acas.intraday ... wfr wfr.daily wfr.intraday x x.daily x.intraday
  • 48. Stock ‘Metadata’ { "_id" : ObjectId("4c5a15038ead0eec04000000"), "timestamp" : 1228995000, "data" : [ "0", "92", "0" ] }
  • 49. Interval Data { "_id" : ObjectId("4bb901cc8ead0e041a0d0000"), "timestamp" : 1228994100, "number" : 3, "charts" : [ { "number" : "1", "data" : [ -99999, -99999, -99999, -99999, -99999 ], "domainLength" : 300, "domainDates" : [ "Tue Nov 25 10:45:00 GMT-0500 2008", ... ], "lines" : [ { "first" : 76, "last" : 300, "points" : [ { "index" : 1, "value" : 0 }, { ... } ] }, { ... }, { ... }, { ... }, { ... }, { ... } ] }, { ... }, { ... }, { ... } ] }
  • 50. Connect /** * @return MongoDB */ protected static function _getDb() { static $db; if( !isset($db) ) { $mongo = new Mongo(); $db = $mongo->selectDb('...db name...'); } return $db; }
  • 51. SELECT TOP 1... //... $db = self::_getDb(); $collection = $db->selectCollection(strtolower($symbol)); $dti = $collection->find() ->sort(array( 'timestamp' => -1 )) ->limit(1) ->getNext(); //...
  • 52. Get Specific Timestamp //... $tstamp = strtotime($lastTimestamp); $cur = $collection->find(array( 'timestamp' => $tstamp )) ->limit(1); //...
  • 53. Only Get Timestamps //... $dailyCollection = $db->selectCollection(strtolower($symbol).'.daily'); $dailyCur = $dailyCollection->find(array(), array('timestamp')) ->sort(array( 'timestamp' => 1 )); foreach( $dailyCur as $timestamp ) { //... } //...
  • 54. Utilizing Collections //... $db = self::_getDb(); $stocks = array(); foreach( $db->listCollections() as $collection ) { $collection_name = strtolower($collection->getName()); if( preg_match('#^[a-z0-9]+$#i', $collection_name) ) { $collection_name = strtoupper($collection_name); $stocks[] = $collection_name; } } sort($stocks); //...
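The PHP above derives the list of stock symbols from collection names: names containing only letters and digits are top-level stock collections, while names such as `aapl.daily` are skipped. The same filtering step, sketched in JavaScript over a hard-coded list of names (the list is sample data, not output from a real server):

```javascript
// Sample collection names in the style of the database layout slide.
const collectionNames = ['aapl', 'aapl.daily', 'aapl.intraday', 'x', 'x.daily', 'acas', 'acas.intraday'];

// Keep names made only of letters and digits, then upper-case and sort them.
function stockSymbols(names) {
  return names
    .filter((name) => /^[a-z0-9]+$/i.test(name))
    .map((name) => name.toUpperCase())
    .sort();
}

console.log(stockSymbols(collectionNames)); // [ 'AAPL', 'ACAS', 'X' ]
```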
  • 55. The ‘Wish I Did’
  • 56. Not Too Terrible • Keep track of Student Cases • A case keeps track of demographics, diagnoses, disabilities, notes, schedule, etc. • Also tracks Exams • Schedule multiple exams per course • Finally, students can log into a portal, counselors can log in • Basic user management
  • 57. Mostly Static • Most information is display-only • Even with reporting
  • 58. So I used good old fashioned RDBMS design...
  • 60. Instead... • A collection for Student Cases • A collection for Courses • etc... • Denormalize!
  • 62. A Student Document { "_id":ObjectId("4c5f572e8ead0ed00d0f0000"), "uin":"485596916", "firstname":"Zach", "middleinitial":"I", "lastname":"Hill", "major":"ACCT", "classification":"G4", "registrationdate":"2008-03-09", "renewaldates":[ ..., { "semester":"201021", "date":"2010-05-15" } ], "localaddress":{ "street":"9116 View", "city":"College Station", "state":"TX", "zip":"77840" }, "permanentaddress":{ "street":"3960 Lake", "city":"Los Angeles", "state":"CA", "zip":"90001" }, "disabilities":[ "Blind", "Asthma" ], "casenotes":[ ..., { "counselor":"Zander King", "note":"lorem ipsum bibendum enim ..." } ], "schedules":[ { "semester":"201021", "courses":[ ..., { "$ref":"courses", "$id":ObjectId(...) } ] } ] }
  • 63. A Course Document { "_id" : ObjectId("4c5f572e8ead0ed00d000000"), "subject" : "MECH", "course" : 175, "section" : 506, "faculty" : "Dr. Quagmire Hill" }
  • 65. MongoDB DBRef • Similar to FK • Requires driver support • Not query-able
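A DBRef is a small sub-document naming a collection (`$ref`) and an `_id` (`$id`); following it means doing a second lookup. The sketch below imitates that lookup over an in-memory object. The `db` object, the string ids, and `resolveRef` are hypothetical stand-ins, not driver calls.

```javascript
// Hypothetical in-memory "database": collection name -> array of documents.
const db = {
  courses: [
    { _id: 'c1', subject: 'MECH', course: 175, section: 506 },
    { _id: 'c2', subject: 'ACCT', course: 229, section: 501 }
  ]
};

// A DBRef-shaped value: which collection to look in, and which _id to fetch.
const ref = { $ref: 'courses', $id: 'c1' };

// Resolve the reference with a second lookup; returns null when nothing matches.
function resolveRef(database, dbref) {
  const collection = database[dbref.$ref] || [];
  return collection.find((doc) => doc._id === dbref.$id) || null;
}

console.log(resolveRef(db, ref).subject); // MECH
```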
  • 66. But can we do everything we need?
  • 67. Paginate? > db.students.find( {}, {_id:1} ).skip( 20 ).limit( 10 ) { "_id" : ObjectId("4c5f572e8ead0ed00d230000") } { "_id" : ObjectId("4c5f572e8ead0ed00d240000") } { "_id" : ObjectId("4c5f572e8ead0ed00d250000") } { "_id" : ObjectId("4c5f572e8ead0ed00d260000") } { "_id" : ObjectId("4c5f572e8ead0ed00d270000") } { "_id" : ObjectId("4c5f572e8ead0ed00d280000") } { "_id" : ObjectId("4c5f572e8ead0ed00d290000") } { "_id" : ObjectId("4c5f572e8ead0ed00d2a0000") } { "_id" : ObjectId("4c5f572e8ead0ed00d2b0000") } { "_id" : ObjectId("4c5f572e8ead0ed00d2c0000") } >
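The `skip( 20 ).limit( 10 )` call above fetches the third page of ten results. A small sketch of the page arithmetic, with `pageWindow` and `paginate` as assumed helper names operating on a plain array:

```javascript
// Translate a 1-based page number into skip/limit values.
function pageWindow(page, pageSize) {
  if (page < 1 || pageSize < 1) {
    throw new RangeError('page and pageSize must be >= 1');
  }
  return { skip: (page - 1) * pageSize, limit: pageSize };
}

// In-memory equivalent of .skip(n).limit(m) on an array of results.
function paginate(docs, page, pageSize) {
  const { skip, limit } = pageWindow(page, pageSize);
  return docs.slice(skip, skip + limit);
}

const ids = Array.from({ length: 35 }, (_, i) => i);
console.log(pageWindow(3, 10));     // { skip: 20, limit: 10 }
console.log(paginate(ids, 4, 10));  // the last, partial page
```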
  • 68. Only Renewed Once? > db.students.find( {renewaldates: {$size: 1}}, {_id:1, renewaldates:1} ) { "_id" : ObjectId("4c5f572e8ead0ed00d150000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d190000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d440000"), "renewaldates" : [ { "semester" : "201021", "date" : "2010-05-15" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d460000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d4e0000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d660000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d6f0000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d720000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d750000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d800000"), "renewaldates" : [ { "semester" : "201021", "date" : "2010-05-15" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d880000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d8d0000"), "renewaldates" : [ { "semester" : "201021", "date" : "2010-05-15" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d8f0000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } { "_id" : ObjectId("4c5f572e8ead0ed00d940000"), "renewaldates" : [ { "semester" : "201031", "date" : "2010-08-16" } ] } >
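`$size: 1` matches documents whose array field holds exactly one element. An in-memory sketch of that test, using made-up sample records and an assumed helper `hasSize`:

```javascript
// Sample records; uin values and dates are invented for illustration.
const renewals = [
  { uin: '1', renewaldates: [{ semester: '201031', date: '2010-08-16' }] },
  { uin: '2', renewaldates: [] },
  { uin: '3', renewaldates: [{ semester: '201021', date: '2010-05-15' }, { semester: '201031', date: '2010-08-16' }] },
  { uin: '4' } // no renewaldates field at all
];

// True only when the field is an array of exactly n elements.
function hasSize(doc, field, n) {
  return Array.isArray(doc[field]) && doc[field].length === n;
}

const renewedOnce = renewals.filter((doc) => hasSize(doc, 'renewaldates', 1)).map((doc) => doc.uin);
console.log(renewedOnce); // [ '1' ]
```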
  • 69. How Many Are Blind Or Asthmatic? > db.students.find( {disabilities: {$in: ['Blind', 'Asthma']}} ).count() 76 >
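Note that `$in` matches a document when its array contains any one of the listed values, whereas `$all` requires every listed value to be present. The difference is sketched below on invented sample data, with `matchesIn` and `matchesAll` as assumed helper names:

```javascript
// Invented sample data covering each combination.
const roster = [
  { uin: '1', disabilities: ['Blind', 'Asthma'] },
  { uin: '2', disabilities: ['Blind'] },
  { uin: '3', disabilities: ['Asthma'] },
  { uin: '4', disabilities: [] }
];

// $in-style test: at least one wanted value is present.
const matchesIn = (values, wanted) => wanted.some((w) => values.includes(w));
// $all-style test: every wanted value is present.
const matchesAll = (values, wanted) => wanted.every((w) => values.includes(w));

const wanted = ['Blind', 'Asthma'];
const either = roster.filter((doc) => matchesIn(doc.disabilities, wanted)).length;
const both = roster.filter((doc) => matchesAll(doc.disabilities, wanted)).length;

console.log(either); // 3 (Blind or Asthma)
console.log(both);   // 1 (Blind and Asthma)
```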
  • 70. Who Lives On Elm? > db.students.find( {'localaddress.street': /.*Elm/}, {_id:1, 'localaddress.street':1} ) { "_id" : ObjectId("4c5f572e8ead0ed00d220000"), "localaddress" : { "street" : "2807 Elm" } } { "_id" : ObjectId("4c5f572e8ead0ed00d290000"), "localaddress" : { "street" : "5762 Elm" } } { "_id" : ObjectId("4c5f572e8ead0ed00d400000"), "localaddress" : { "street" : "6261 Elm" } } { "_id" : ObjectId("4c5f572e8ead0ed00d610000"), "localaddress" : { "street" : "7099 Elm" } } { "_id" : ObjectId("4c5f572e8ead0ed00d930000"), "localaddress" : { "street" : "4994 Elm" } } { "_id" : ObjectId("4c5f572e8ead0ed00d960000"), "localaddress" : { "street" : "3456 Elm" } } >
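The key `'localaddress.street'` uses dot notation to reach into an embedded document, and the value is a regular expression. A plain JavaScript sketch of both ideas, where `getPath` is an assumed helper and the records are sample data:

```javascript
// Walk a dotted path such as 'localaddress.street'; undefined if any step is missing.
function getPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value === undefined || value === null ? undefined : value[key]),
    doc
  );
}

const residents = [
  { _id: 'a', localaddress: { street: '2807 Elm' } },
  { _id: 'b', localaddress: { street: '9116 View' } },
  { _id: 'c' } // no address at all
];

// Keep documents whose street is a string matching the pattern from the slide.
const onElm = residents.filter((doc) => {
  const street = getPath(doc, 'localaddress.street');
  return typeof street === 'string' && /.*Elm/.test(street);
});

console.log(onElm.map((doc) => doc._id)); // [ 'a' ]
```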
  • 71. Number Registered On Or Before 2010/06/20? > db.students .find( {$where: "new Date(this.registrationdate) <= new Date('2010/06/20')"} ) .count() 136 >
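The `$where` clause runs a JavaScript expression against each document. Since the sample student document stores `registrationdate` as a zero-padded `YYYY-MM-DD` string, plain string comparison would also order the values chronologically. The sketch below assumes every stored value follows that format; the sample dates are invented.

```javascript
// Zero-padded YYYY-MM-DD strings sort in chronological order as plain strings.
const registrations = ['2008-03-09', '2010-06-20', '2010-06-21', '2011-01-02'];

// Keep dates on or before the cutoff using string comparison only.
function onOrBefore(dates, cutoff) {
  return dates.filter((date) => date <= cutoff);
}

console.log(onOrBefore(registrations, '2010-06-20').length); // 2
```

Under the same assumption, a range operator on the string field (for example `{registrationdate: {$lte: '2010-06-20'}}`) expresses the same comparison without evaluating JavaScript per document.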
  • 72. Update Classification? > db.students.find( {uin:'735383393'}, {_id: 1, uin: 1, classification: 1} ) { "_id" : ObjectId("4c60dae48ead0e143e0f0000"), "uin" : "735383393", "classification" : "G2" } > db.students.update( {uin:'735383393'}, {$set: {classification: 'G3'}} ) > db.students.find( {uin:'735383393'}, {_id: 1, uin: 1, classification: 1} ) { "_id" : ObjectId("4c60dae48ead0e143e0f0000"), "uin" : "735383393", "classification" : "G3" } >
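`$set` changes only the named fields and leaves the rest of the document alone. A minimal in-memory sketch of that behaviour for top-level fields, with `applySet` as an assumed helper name:

```javascript
// Return a copy of the document with only the listed top-level fields replaced.
function applySet(doc, changes) {
  return Object.assign({}, doc, changes);
}

const before = { _id: 'd1', uin: '735383393', classification: 'G2' };
const after = applySet(before, { classification: 'G3' });

console.log(after.classification); // G3
console.log(after.uin);            // 735383393 (untouched)
```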
  • 73. Home Towns? > db.students.distinct('permanentaddress.city') [ "Austin", "Chicago", "Dallas", "Denver", "Houston", "Los Angeles", "Lubbock", "New York" ] >
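`distinct` returns each value of a field once. An in-memory sketch over invented records follows; the result is sorted here only to make the output stable, which is a choice of this sketch rather than a claim about the server.

```javascript
// Invented sample records; one has no permanent address.
const homes = [
  { permanentaddress: { city: 'Dallas' } },
  { permanentaddress: { city: 'Austin' } },
  { permanentaddress: { city: 'Dallas' } },
  {}
];

// Gather each city once, skipping documents without the field.
function distinctCities(docs) {
  const seen = new Set();
  for (const doc of docs) {
    const city = doc.permanentaddress && doc.permanentaddress.city;
    if (city !== undefined) seen.add(city);
  }
  return [...seen].sort();
}

console.log(distinctCities(homes)); // [ 'Austin', 'Dallas' ]
```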
  • 74. Number of students by major? > db.students .group({ key: {major:true}, cond: {major: {$exists: true}}, reduce: function(obj, prev) { prev.count += 1; }, initial: { count: 0 } }) [ {"major" : "CPSC", "count" : 12}, {"major" : "MECH", "count" : 16}, {"major" : "ACCT", "count" : 18}, {"major" : "MGMT", "count" : 18}, {"major" : "FINC", "count" : 16}, {"major" : "ENDS", "count" : 15}, {"major" : "ARCH", "count" : 18}, {"major" : "ENGL", "count" : 15}, {"major" : "POLS", "count" : 22} ] >
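The `group` call above combines a condition, an initial value, and a reduce function that runs once per matching document. The same three pieces, sketched in plain JavaScript on invented data with an assumed helper `groupCount`:

```javascript
// Count documents per value of `key`, skipping documents that lack the key.
function groupCount(docs, key) {
  const groups = new Map();
  for (const doc of docs) {
    if (!(key in doc)) continue;                 // cond: {major: {$exists: true}}
    const prev = groups.get(doc[key]) || { [key]: doc[key], count: 0 }; // initial: {count: 0}
    prev.count += 1;                             // reduce: prev.count += 1
    groups.set(doc[key], prev);
  }
  return [...groups.values()];
}

const enrolled = [{ major: 'CPSC' }, { major: 'ACCT' }, { major: 'CPSC' }, { uin: 'no-major' }];
console.log(groupCount(enrolled, 'major'));
// [ { major: 'CPSC', count: 2 }, { major: 'ACCT', count: 1 } ]
```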
  • 75. How Many Students In A Course? > db.students .find({'schedules.courses': {$in: [ new DBRef('courses', new ObjectId('4c60dae48ead0e143e000000')) ]}}) .count() 25
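The query above counts students whose nested `schedules.courses` arrays contain a reference to one course. An in-memory sketch of that traversal follows; the records, the string ids, and `studentsInCourse` are made up for illustration.

```javascript
// Invented case files; each schedule lists DBRef-shaped course references.
const caseFiles = [
  { uin: '1', schedules: [{ semester: '201021', courses: [{ $ref: 'courses', $id: 'c1' }, { $ref: 'courses', $id: 'c2' }] }] },
  { uin: '2', schedules: [{ semester: '201021', courses: [{ $ref: 'courses', $id: 'c2' }] }] },
  { uin: '3', schedules: [] }
];

// Count students with at least one schedule that references the given course id.
function studentsInCourse(docs, courseId) {
  return docs.filter((doc) =>
    (doc.schedules || []).some((schedule) =>
      (schedule.courses || []).some((c) => c.$ref === 'courses' && c.$id === courseId)
    )
  ).length;
}

console.log(studentsInCourse(caseFiles, 'c2')); // 2
```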
  • 76. No Time To Cover... • Map-Reduce • MongoDB has it, and it is extremely powerful • GridFS • Store files/blobs • Sharding/Replica Pairs/Master-Slave
  • 77. Q&A
  • 78. Resources • BASE: An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128 • PHP Mongo Driver Reference http://php.net/mongo • MongoDB Advanced Query Reference http://www.mongodb.org/display/DOCS/Advanced+Queries • MongoDB Query Cheat Sheet http://www.10gen.com/reference • myNoSQL Blog http://nosql.mypopescu.com/

Editor's Notes