NoSQL: Introduction



              Asya Kamsky
• 1970's Relational Databases Invented
  – Storage is expensive
  – Data is normalized
  – Data storage is abstracted away from app




• 1980's RDBMS commercialized
  – Client/Server model
  – SQL becomes the standard




• 1990's Things begin to change
  – Client/Server=> 3-tier architecture
  – Rise of the Internet and the Web
• 2000's Web 2.0
  –   Rise of "Social Media"
  –   Acceptance of E-Commerce
  –   Constant decrease of HW prices
  –   Massive increase of collected data








• Result
  – Constant need to scale dramatically
  – How can we scale?
Computers in 1985
• x286 5-35 MHz
• 56 kbps
• 64 KB RAM
• 10 MB HDD

Computers in 1995
• Pentium 100 MHz
• 20-50 Mbps
• 16 MB RAM
• 200 MB HDD

Phone in 2012
• Dual core 1.2 GHz
• WiFi 802.11n - 300+ Mbps
• 1 GB RAM
• 48 GB SSD

Computers in 2012
• Dual core 1.8 GHz
• WiFi 802.11n - 300+ Mbps
• 180+ Gbps
• 8 GB RAM
• 512 GB SSD
• Agile Development Methodology
   • Shorter development cycles
   • Constant evolution of requirements
   • Flexibility at design time




• Relational Schema
   • Hard to evolve
      • long painful migrations
      • must stay in sync with application
   • few developers interact directly
BI / reporting (fewer issues here)
+ ad hoc queries
+ SQL standard protocol between clients and servers
+ scales horizontally better than oper dbs.
- some scale limits at massive scale
- schemas are rigid
- no real time; great at bulk nightly data loads

OLTP / operational (a lot more issues here)
+ complex transactions
+ tabular data
+ ad hoc queries
- O<->R mapping hard
- speed/scale problems
- not super agile

[Diagram callouts: caching, app layer partitioning, flat files, map/reduce]
•   Horizontal scaling
•   Run anywhere
•   Flexible data model
•   Faster development
•   Low upfront cost
•   Low cost of ownership



What is NoSQL?


Relational vs Non-Relational
scalable nonrelational ("nosql")
+ speed and scale
- ad hoc query limited
- not very transactional
- no sql/no standard
+ fits OO well
+ agile

[Diagram: shown alongside BI / reporting and OLTP / operational]
Non-relational next generation operational data stores and databases

A collection of very different products
•   Different data models (Not relational)
•   Most are not using SQL for queries
•   No predefined schema
•   Some allow flexible data structures

• Relational         •   Key-Value
                     •   Document
                     •   XML
                     •   Graph
                     •   Column

• ACID               • BASE
                     • Some ACID properties

• Two-phase commit   • Atomic transactions on
                       document level

• Joins              • No Joins
• Fits your use case

• Reliability

• Maintainability

• Ease of Use

• Scalability

• Cost
MongoDB: Introduction




• Designed and developed by founders of DoubleClick,
  ShopWiki, Gilt Groupe, etc.
• GOAL: create a high performance, fully consistent,
  horizontally scalable general purpose data store.


• Coding started fall 2007
• Open Source – AGPL, written in C++
• First production site March 2008 - businessinsider.com
• Currently version 2.2 – August 2012

MongoDB Design Goals
• Document-oriented Storage
   • Based on JSON Documents
   • Data serialized to BSON
   • Flexible Schema
• Scalable Architecture
   • Replication
   • High availability
   • Auto-sharding
   • Extensive use of memory mapped files
   • Durable
   • Strong Consistency
• Key Features Include:
   • Full featured indexes
   • Ad-hoc Query Language
   • Interactive shell
   • Aggregation queries
   • Map/Reduce
• Rich data models
• Seamlessly map to native programming language types
• Flexible for dynamic data
• Better data locality
Blogging website:
  Register users
  Users post blog entries
  Comment on others' entries
  Considering:
      Tagging, Voting, ???


[Diagram: join table]
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    title : "My Very Important Thoughts",
    published: ISODate("2011-07-26T19:49:00.147Z"),
    author : { name:"Asya Kamsky", username:"asya" },
    text : "It was a long and stormy night ..."
}




{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    title : "My Very Important Thoughts",
    published: ISODate("2011-07-26T19:49:00.147Z"),
    author : { name:"Asya Kamsky", username:"asya" },
    text : "It was a long and stormy night ...",
    tags : ["business", "news", "north america"]
}

> db.posts.ensureIndex( { tags : 1 } )




> db.posts.find( { tags : "news" } )
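An index on an array field such as tags holds one entry per array element, which is why the find above can locate the post from a single tag. The following is a minimal sketch of that idea in plain JavaScript. It is a toy model, not MongoDB's B-tree implementation; buildMultikeyIndex, findByTag, and the second post are hypothetical names and data added for illustration.

```javascript
// Toy model of a multikey index: one index entry per array element.
// Illustration only; this is not how MongoDB stores its indexes.
const posts = [
  { _id: 1, title: "My Very Important Thoughts", tags: ["business", "news", "north america"] },
  { _id: 2, title: "Another Post", tags: ["sports"] }, // hypothetical second post
];

// Build a Map from each array value to the _ids of the documents that carry it.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const value of doc[field] || []) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

// Look up matching _ids through the index instead of scanning every post.
function findByTag(index, tag) {
  return index.get(tag) || [];
}

const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(findByTag(tagIndex, "news")); // [ 1 ]
```

A lookup for a tag that no post carries returns an empty array rather than scanning the collection.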




> db.posts.find( { tags : "news" } ).explain()
{   "cursor" : "BtreeCursor tags_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 1,
    "scanAndOrder" : false,
    "indexOnly" : false,
    ...
}
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    title : "My Very Important Thoughts",
    published: ISODate("2011-07-26T19:49:00.147Z"),
    author : { name:"Asya Kamsky", username:"asya" },
    text : "It was a long and stormy night ...",
    tags : ["business", "news", "north america"],
    votes : 3,
    voters : ["dmerr", "sj", "jane" ]
}

> db.posts.update( { },   // query for documents to update
                   { }    // update to perform
                 )


> db.posts.update( {_id:..., voters:{$ne:"asya"} },
                   { $push: {voters:"asya"},
                     $inc : {votes: 1}
                   } )
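
The update above only matches while "asya" is not yet in voters, so the $push and the $inc are applied together at most once per user. The following is a minimal in-memory sketch of that logic in plain JavaScript. applyVote is a hypothetical helper; in MongoDB the server performs the match and the modification as one atomic operation on the document.

```javascript
// Toy in-memory version of the conditional update above.
// applyVote is a hypothetical helper, not a MongoDB API.
function applyVote(post, username) {
  // Query part: { voters: { $ne: username } } matches only if the user has not voted.
  if (post.voters.includes(username)) {
    return false; // no document matched, so nothing changes
  }
  // Update part: $push the voter and $inc the counter as one step.
  post.voters.push(username);
  post.votes += 1;
  return true;
}

const post = { votes: 3, voters: ["dmerr", "sj", "jane"] };
console.log(applyVote(post, "asya")); // true
console.log(applyVote(post, "asya")); // false, the second vote is ignored
console.log(post.votes);              // 4
```

Keeping the check and the modification in one operation is what prevents a double vote when two requests arrive at the same time.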
{
    _id : ObjectId("4e2e3f92268cdda473b628f6"),
    title : "My Very Important Thoughts",
    published: ISODate("2011-07-26T19:49:00.147Z"),
    author : { name:"Asya Kamsky", username:"asya" },
    text : "It was a long and stormy night ...",
    tags : ["business", "news", "north america"],
    votes : 4,
    voters : ["dmerr", "sj", "jane", "asya" ],
    comments : [
       { by : "tim157", text : "great story", ... },
       { by : "gora", text : "i don’t think so", ... },
       { by : "dmerr", text : "also check out..." }
    ]
}

> db.posts.ensureIndex( { "comments.by" : 1 } )
> db.posts.find( { "comments.by" : "gora" } )
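
The dotted field name "comments.by" reaches into each document embedded in the comments array: a post matches if any one of its comments has the given author. The following is a rough plain-JavaScript equivalent of that matching rule, shown as a full scan without an index. findByCommenter and the second post are hypothetical additions for illustration.

```javascript
// Plain-JavaScript equivalent of matching { "comments.by": "gora" }:
// a post matches if any embedded comment has by === "gora".
const posts = [
  {
    title: "My Very Important Thoughts",
    comments: [
      { by: "tim157", text: "great story" },
      { by: "gora", text: "i don't think so" },
      { by: "dmerr", text: "also check out..." },
    ],
  },
  { title: "A Post Without Comments" }, // hypothetical second post
];

// findByCommenter is a hypothetical helper; posts with no comments never match.
function findByCommenter(docs, username) {
  return docs.filter((doc) =>
    (doc.comments || []).some((comment) => comment.by === username)
  );
}

console.log(findByCommenter(posts, "gora").map((p) => p.title));
// [ 'My Very Important Thoughts' ]
```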
Seek = 5+ ms          Read = really really fast




[Diagram: Post, Comment, Author]
Disk seeks and data locality


[Diagram: Post, with Author and five Comments nested beneath it]
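
As a back-of-the-envelope illustration of why locality matters, the sketch below compares seek time for the two layouts. It assumes one disk seek per separately stored record and the 5 ms seek time quoted earlier; both are simplifications, and the function names are hypothetical.

```javascript
// Rough seek-cost arithmetic, assuming one disk seek per separately stored record
// and a 5 ms seek. Both figures are simplifying assumptions for illustration.
const SEEK_MS = 5;

// Separate storage: the post, its author, and each comment live in different places.
function normalizedSeekMs(commentCount) {
  return (1 + 1 + commentCount) * SEEK_MS;
}

// Embedded storage: author and comments are stored inside the post document.
function embeddedSeekMs() {
  return 1 * SEEK_MS;
}

console.log(normalizedSeekMs(5)); // 35
console.log(embeddedSeekMs());    // 5
```

With five comments the separate layout pays for seven seeks and the embedded layout for one. Caching and sequential reads would narrow the gap in practice.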
• High Availability
• Data Redundancy
• Increase capacity with no downtime
• Transparent to the application




• A cluster of N servers
• Any (one) node can be primary
• All writes to primary
• Reads go to primary (default), optionally to a secondary
• Consensus election of primary
• Automatic failover
• Automatic recovery

[Diagram: Node 1, Node 2, Node 3 (Primary); "Pick me!"]
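
The election and failover bullets can be sketched as a simple majority rule. This is a deliberately simplified model, not MongoDB's actual election protocol: it assumes a primary can only be chosen while a majority of members is reachable, and it picks the reachable member with the newest data. electPrimary and the member fields are hypothetical names.

```javascript
// Simplified replica-set model: a primary can only be elected while a majority
// of the N members is reachable. Real elections consider more factors
// (for example member priority); those are omitted here.
function electPrimary(members) {
  const reachable = members.filter((m) => m.up);
  const majority = Math.floor(members.length / 2) + 1;
  if (reachable.length < majority) return null; // no majority, so no primary
  // Assumption for this sketch: choose the reachable member with the newest data.
  return reachable.reduce((best, m) => (m.optime > best.optime ? m : best)).name;
}

const members = [
  { name: "node1", up: true, optime: 100 },
  { name: "node2", up: true, optime: 98 },
  { name: "node3", up: false, optime: 100 }, // the old primary has failed
];
console.log(electPrimary(members)); // node1
```

If two of the three members were down, the function would return null: the remaining node cannot form a majority, so it does not accept writes as primary.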
Replica Sets
• High Availability/Automatic Failover
• Data Redundancy
• Disaster Recovery
• Transparent to the application
• Perform maintenance with no downtime
Asynchronous
Replication




Automatic Election
•   Increase capacity with no downtime
•   Transparent to the application
•   Range based partitioning
•   Partitioning and balancing are automatic
Key Range    Key Range    Key Range   Key Range
 min..25      26..50       51..75      76.. max


Primary      Primary      Primary      Primary


Secondary    Secondary    Secondary    Secondary


Secondary    Secondary    Secondary    Secondary
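
Range-based partitioning means a router only needs the key ranges to decide which shard holds a document. The following is a minimal sketch in plain JavaScript, using the four ranges from the diagram above. It assumes integer shard-key values and inclusive ranges as drawn; the shard names and routeToShard are hypothetical.

```javascript
// Toy router: map a shard-key value to the shard whose key range contains it.
// Ranges mirror the diagram (integer keys, inclusive bounds); shard names are hypothetical.
const chunks = [
  { min: -Infinity, max: 25, shard: "shard1" },
  { min: 26, max: 50, shard: "shard2" },
  { min: 51, max: 75, shard: "shard3" },
  { min: 76, max: Infinity, shard: "shard4" },
];

function routeToShard(key) {
  const chunk = chunks.find((c) => key >= c.min && key <= c.max);
  if (!chunk) throw new Error(`no chunk covers key ${key}`);
  return chunk.shard;
}

console.log(routeToShard(10)); // shard1
console.log(routeToShard(50)); // shard2
console.log(routeToShard(99)); // shard4
```

A query on a single key value touches one shard, while a query spanning several ranges has to be sent to each shard that owns part of the range.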


Application   Application   Application   Application

MongoS   MongoS   MongoS   MongoS   MongoS          Config   Config   Config

Key Range    Key Range    Key Range    Key Range
min..25      26..50       51..75       76..max

Primary      Primary      Primary      Primary
Secondary    Secondary    Secondary    Secondary
Secondary    Secondary    Secondary    Secondary
• Few configuration options
• Does the right thing out of the box
• Easy to deploy and manage




Better data locality   In-Memory Caching   Auto-Sharding (Read scaling, Write scaling)

[Chart: Relational vs. MongoDB]

"We just can't get any faster than the way MongoDB handles our data."
   Tony Tam, CTO, Wordnik
• Supported Platforms:
  – Linux, Windows, Solaris, Mac OS X
  – Packages available for all popular distributions

  No external/third party software dependencies
  10gen maintains drivers for over a dozen languages
• Content Management
• Operational Intelligence
• E-Commerce
• User Data Management
• High Volume Data Feeds
Open source, high performance database




 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Recently uploaded

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 

Recently uploaded (20)

Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 

Intro to NoSQL and MongoDB

  • 1. NoSQL: Introduction (Asya Kamsky)
  • 2. • 1970's Relational Databases Invented – Storage is expensive – Data is normalized – Data storage is abstracted away from app
  • 3. • 1970's Relational Databases Invented – Storage is expensive – Data is normalized – Data storage is abstracted away from app • 1980's RDBMS commercialized – Client/Server model – SQL becomes the standard
  • 4. • 1970's Relational Databases Invented – Storage is expensive – Data is normalized – Data storage is abstracted away from app • 1980's RDBMS commercialized – Client/Server model – SQL becomes the standard • 1990's Things begin to change – Client/Server => 3-tier architecture – Rise of the Internet and the Web
  • 5. • 2000's Web 2.0 – Rise of "Social Media" – Acceptance of E-Commerce – Constant decrease of HW prices – Massive increase of collected data
  • 6. • 2000's Web 2.0 – Rise of "Social Media" – Acceptance of E-Commerce – Constant decrease of HW prices – Massive increase of collected data • Result – Constant need to scale dramatically – How can we scale?
  • 7. Computers in 1985: • x286 5-35 MHz • 56 kbps • 64 KB RAM • 10 MB HDD
  • 8. Computers in 1985: • x286 5-35 MHz • 56 kbps • 64 KB RAM • 10 MB HDD | Computers in 1995: • Pentium 100 MHz • 20-50 Mbps • 16 MB RAM • 200 MB HDD
  • 9. Computers in 1985: • x286 5-35 MHz • 56 kbps • 64 KB RAM • 10 MB HDD | Computers in 1995: • Pentium 100 MHz • 20-50 Mbps • 16 MB RAM • 200 MB HDD | Phone in 2012: • Dual core 1.2 GHz • WiFi 802.11n - 300+ Mbps • 1 GB RAM • 48 GB SSD
  • 10. Computers in 1985: • x286 5-35 MHz • 56 kbps • 64 KB RAM • 10 MB HDD | Computers in 1995: • Pentium 100 MHz • 20-50 Mbps • 16 MB RAM • 200 MB HDD | Computers in 2012: • Dual core 1.8 GHz • WiFi 802.11n - 300+ Mbps • 180+ Gbps • 8 GB RAM • 512 GB SSD
  • 11. • Agile Development Methodology • Shorter development cycles • Constant evolution of requirements • Flexibility at design time
  • 12. • Agile Development Methodology • Shorter development cycles • Constant evolution of requirements • Flexibility at design time • Relational Schema • Hard to evolve • long painful migrations • must stay in sync with application • few developers interact directly
  • 13. OLTP / operational (a lot more issues here): + complex transactions + tabular data + ad hoc queries - O<->R mapping hard - speed/scale problems - not super agile | BI / reporting (fewer issues here): + ad hoc queries + SQL standard protocol between clients and servers + scales horizontally better than oper dbs. - some scale limits at massive scale - schemas are rigid - no real time; great at bulk nightly data loads
  • 14. OLTP / operational: + complex transactions + tabular data + ad hoc queries - O<->R mapping hard - speed/scale problems - not super agile; workarounds: caching, app layer partitioning | BI / reporting: + ad hoc queries + SQL standard protocol between clients and servers + scales horizontally better than oper dbs. - some scale limits at massive scale - schemas are rigid - no real time; great at bulk nightly data loads; workarounds: flat files, map/reduce
  • 15. [no text]
  • 16. • Agile Development Methodology • Shorter development cycles • Constant evolution of requirements • Flexibility at design time
  • 17. • Agile Development Methodology • Shorter development cycles • Constant evolution of requirements • Flexibility at design time • Relational Schema • Hard to evolve • long painful migrations • must stay in sync with application • few developers interact directly
  • 18. [no text]
  • 19. • Horizontal scaling • Run anywhere • Flexible data model • Faster development • Low upfront cost • Low cost of ownership
  • 20. What is NoSQL? Relational vs Non-Relational
  • 21. Scalable nonrelational ("nosql"): + speed and scale + fits OO well + agile - ad hoc query limited - not very transactional - no SQL / no standard. [Diagram labels: BI / reporting; OLTP / operational]
  • 22. Non-relational, next-generation operational data stores and databases. A collection of very different products: • Different data models (not relational) • Most are not using SQL for queries • No predefined schema • Some allow flexible data structures
  • 23. Relational | Key-Value • Document • XML • Graph • Column
  • 24. Relational: • ACID | Key-Value • Document • XML • Graph • Column: • BASE • Some ACID properties
  • 25. Relational: • ACID • Two-phase commit | Key-Value • Document • XML • Graph • Column: • BASE • Some ACID properties • Atomic transactions on document level
  • 26. Relational: • ACID • Two-phase commit • Joins | Key-Value • Document • XML • Graph • Column: • BASE • Some ACID properties • Atomic transactions on document level • No Joins
  • 27. [no text]
  • 28. • Fits your use case • Reliability • Maintainability • Ease of Use • Scalability • Cost
  • 30. [no text]
  • 31. • Designed and developed by founders of DoubleClick, ShopWiki, Gilt Groupe, etc. • GOAL: create a high-performance, fully consistent, horizontally scalable general-purpose data store. • Coding started fall 2007 • Open Source – AGPL, written in C++ • First production site March 2008 - businessinsider.com • Currently version 2.2 – August 2012
  • 33. [no text]
  • 34. Document-oriented Storage: • Based on JSON Documents • Data serialized to BSON • Flexible Schema | Scalable Architecture: • Replication • High availability • Auto-sharding • Extensive use of memory mapped files • Durable • Strong Consistency | Key Features Include: • Full featured indexes • Ad-hoc Query Language • Interactive shell • Aggregation queries • Map/Reduce
  • 35. • Rich data models • Seamlessly map to native programming language types • Flexible for dynamic data • Better data locality
  • 36. Blogging website: • Register users • Users post blog entries • Comment on others' entries. Considering: Tagging, Voting, ???
  • 38. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ..." }
  • 39. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"] }
        > db.posts.ensureIndex( { tags : 1 } )
  • 40. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"] }
        > db.posts.find( { tags : "news" } )
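
The index on `tags` in slides 39 and 40 covers an array field, so each array element gets its own index entry (the `"isMultiKey" : true` flag in the explain output on slide 41). The following is a minimal in-memory sketch of that idea in plain JavaScript. It is not how MongoDB implements its B-tree indexes; `buildMultikeyIndex` and `findByTag` are hypothetical helper names, and the second and third sample posts are invented for illustration.

```javascript
// Illustrative only: a Map stands in for the index structure.
// Sample data: the first post mirrors the slide, the others are made up.
const posts = [
  { _id: 1, title: "My Very Important Thoughts", tags: ["business", "news", "north america"] },
  { _id: 2, title: "Another Post", tags: ["sports", "news"] },
  { _id: 3, title: "Untagged Post", tags: [] },
];

// Build one index entry per array element (the "multikey" part).
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const value of doc[field] || []) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

// Look up matching _ids through the index instead of scanning every document.
function findByTag(index, tag) {
  return index.get(tag) || [];
}

const tagsIndex = buildMultikeyIndex(posts, "tags");
console.log(findByTag(tagsIndex, "news")); // ids of the posts tagged "news"
```

Because the lookup goes through the index, a query for one tag touches only the matching entries, which is the same effect the explain output reports as `"nscannedObjects" : 1`.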
  • 41. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"] }
        > db.posts.find( { tags : "news" } ).explain()
        { "cursor" : "BtreeCursor tags_1", "isMultiKey" : true, "n" : 1, "nscannedObjects" : 1, "scanAndOrder" : false, "indexOnly" : false,
  • 42. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"],
          votes : 3,
          voters : ["dmerr", "sj", "jane"] }
        > db.posts.update(
            { },   // query for documents to update
            { }    // update to perform
          )
  • 43. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"],
          votes : 3,
          voters : ["dmerr", "sj", "jane"] }
        > db.posts.update(
            { _id : ..., voters : { $ne : "asya" } },
            { $push : { voters : "asya" }, $inc : { votes : 1 } }
          )
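
The update on slide 43 combines a guard in the query (`voters : { $ne : "asya" }`) with two modifiers (`$push` and `$inc`), so a user can vote only once. The sketch below simulates that logic on a plain JavaScript object to make the behavior visible. `voteOnce` is a hypothetical helper, not a MongoDB API; in the database the match and the modification are applied as a single atomic operation on the document, which a plain function only approximates.

```javascript
// Simulates update({ voters: { $ne: username } },
//                  { $push: { voters: username }, $inc: { votes: 1 } })
// against one in-memory document. Returns true if the document was modified.
function voteOnce(post, username) {
  // Query part: $ne on an array field matches only when the array
  // does not already contain the value.
  if (post.voters.includes(username)) return false;
  // Update part: append the voter and increment the counter together.
  post.voters.push(username);
  post.votes += 1;
  return true;
}

const post = { votes: 3, voters: ["dmerr", "sj", "jane"] };

const firstAttempt = voteOnce(post, "asya");  // the query matches, the vote is applied
const secondAttempt = voteOnce(post, "asya"); // the query matches nothing, no change

console.log(firstAttempt, secondAttempt);
console.log(post.votes, post.voters);
```

After the first call the document has the same `votes` and `voters` values as the one shown on slide 44.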
  • 44. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"],
          votes : 4,
          voters : ["dmerr", "sj", "jane", "asya"],
          comments : [
            { by : "tim157", text : "great story", ... },
            { by : "gora", text : "i don’t think so", ... },
            { by : "dmerr", text : "also check out..." } ] }
  • 45. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"],
          votes : 4,
          voters : ["dmerr", "sj", "jane", "asya"],
          comments : [
            { by : "tim157", text : "great story" },
            { by : "gora", text : "i don’t think so" },
            { by : "dmerr", text : "also check out..." } ] }
        > db.posts.ensureIndex( { "comments.by" : 1 } )
  • 46. { _id : ObjectId("4e2e3f92268cdda473b628f6"),
          title : "My Very Important Thoughts",
          published : ISODate("2011-07-26T19:49:00.147Z"),
          author : { name : "Asya Kamsky", username : "asya" },
          text : "It was a long and stormy night ...",
          tags : ["business", "news", "north america"],
          votes : 4,
          voters : ["dmerr", "sj", "jane", "asya"],
          comments : [
            { by : "tim157", text : "great story" },
            { by : "gora", text : "i don’t think so" },
            { by : "dmerr", text : "also check out..." } ] }
        > db.posts.find( { "comments.by" : "gora" } )
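
The query on slide 46 uses dot notation (`"comments.by"`) to reach into an array of embedded documents: a post matches if any of its comments has the given `by` value. A rough plain-JavaScript sketch of that matching rule follows. `matchesEmbedded` is a hypothetical helper that handles only a two-part `array.field` path, and the second sample post is invented for illustration.

```javascript
// Returns true when any element of doc[arrayField] has subField === value.
// Only supports paths of the form "arrayField.subField" (an assumption made
// to keep the sketch short; real dot notation is more general).
function matchesEmbedded(doc, path, value) {
  const [arrayField, subField] = path.split(".");
  const items = doc[arrayField];
  if (!Array.isArray(items)) return false;
  return items.some((item) => item[subField] === value);
}

const blogPosts = [
  {
    title: "My Very Important Thoughts",
    comments: [
      { by: "tim157", text: "great story" },
      { by: "gora", text: "i don't think so" },
      { by: "dmerr", text: "also check out..." },
    ],
  },
  { title: "A Quiet Post", comments: [{ by: "jane", text: "nice" }] },
];

// Equivalent in spirit to db.posts.find({ "comments.by": "gora" })
const commentedByGora = blogPosts
  .filter((p) => matchesEmbedded(p, "comments.by", "gora"))
  .map((p) => p.title);

console.log(commentedByGora);
```

Keeping comments inside the post document is what lets a single lookup return the post together with its comments, which is the data-locality point made on the slides that follow.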
  • 47. Seek = 5+ ms; Read = really really fast. [Diagram: Post, Comment, Author]
  • 48. Disk seeks and data locality. [Diagram: Post, Author, Comment ×5]
  • 49. • High Availability • Data Redundancy • Increase capacity with no downtime • Transparent to the application
  • 50. A cluster of N servers • Any (one) node can be primary • All writes to primary • Reads go to primary (default), optionally to a secondary • Consensus election of primary • Automatic failover • Automatic recovery. [Diagram: Node 1, Node 2, Node 3, one marked Primary; caption "Pick me!"]
  • 51. Replica Sets • High Availability/Automatic Failover • Data Redundancy • Disaster Recovery • Transparent to the application • Perform maintenance with no downtime
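
Slide 50 describes automatic failover through a consensus election of the primary. The sketch below illustrates only the majority rule behind such an election, under strong simplifying assumptions: `electPrimary` is a hypothetical function, the node names are invented, and picking the first healthy node is an arbitrary stand-in for the factors a real replica set election weighs.

```javascript
// Returns the name of the elected primary, or null when no majority is reachable.
function electPrimary(nodes) {
  const healthy = nodes.filter((n) => n.up);
  // A primary can only be elected when a strict majority of members is up.
  if (healthy.length * 2 <= nodes.length) return null;
  // Assumption for illustration: choose the first healthy node.
  return healthy[0].name;
}

const replicaSet = [
  { name: "node1", up: true },
  { name: "node2", up: true },
  { name: "node3", up: true },
];

const initialPrimary = electPrimary(replicaSet);     // all three up

replicaSet[0].up = false;                            // the primary fails
const afterFailover = electPrimary(replicaSet);      // 2 of 3 still form a majority

replicaSet[1].up = false;                            // a second node fails
const afterSecondFailure = electPrimary(replicaSet); // 1 of 3 is not a majority

console.log(initialPrimary, afterFailover, afterSecondFailure);
```

The last case shows why a three-node set tolerates one failure but not two: without a majority there is no primary, so writes stop rather than risk two nodes both accepting them.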
  • 55. [no text]
  • 57. [no text]
  • 58. • Increase capacity with no downtime • Transparent to the application • Range based partitioning • Partitioning and balancing is automatic
  • 59. [Diagram: four shards by key range: min..25, 26..50, 51..75, 76..max; each shard has one Primary and two Secondaries]
  • 60. [Diagram: Application, then MongoS, then four shards by key range: min..25, 26..50, 51..75, 76..max; each shard has one Primary and two Secondaries]
  • 61. [Diagram: Application, then three MongoS, then four shards by key range: min..25, 26..50, 51..75, 76..max; each shard has one Primary and two Secondaries]
  • 62. [Diagram: four Applications, multiple MongoS, three Config servers, then four shards by key range: min..25, 26..50, 51..75, 76..max; each shard has one Primary and two Secondaries]
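
The sharding slides show range-based partitioning with four key ranges (min..25, 26..50, 51..75, 76..max) and a MongoS router between the application and the shards. A minimal sketch of the routing decision, assuming integer shard keys and inclusive bounds as drawn on the slides, might look like this. The shard names, `routeToShard`, and `shardsForRange` are hypothetical and do not correspond to any MongoDB API.

```javascript
// Key ranges taken from the slides; shard names are invented for illustration.
const chunks = [
  { shard: "shard0", min: -Infinity, max: 25 },
  { shard: "shard1", min: 26, max: 50 },
  { shard: "shard2", min: 51, max: 75 },
  { shard: "shard3", min: 76, max: Infinity },
];

// Route a single-key operation to the one shard whose range contains the key.
function routeToShard(key) {
  const chunk = chunks.find((c) => key >= c.min && key <= c.max);
  return chunk ? chunk.shard : null;
}

// Route a range query to every shard whose range overlaps [lo, hi].
function shardsForRange(lo, hi) {
  return chunks.filter((c) => c.max >= lo && c.min <= hi).map((c) => c.shard);
}

console.log(routeToShard(10));        // falls in min..25
console.log(routeToShard(26));        // falls in 26..50
console.log(shardsForRange(20, 60));  // overlaps the first three ranges
```

A query that includes the shard key can be sent to a single shard, while a query spanning several ranges fans out only to the shards that hold those ranges. This is what keeps the routing transparent to the application.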
  • 63. • Few configuration options • Does the right thing out of the box • Easy to deploy and manage
  • 64. • Better data locality • In-Memory Caching • Auto-Sharding • Read scaling • Write scaling. [Diagram labels: Relational, MongoDB] "We just can't get any faster than the way MongoDB handles our data." Tony Tam, CTO, Wordnik
  • 65. • Supported Platforms: – Linux, Windows, Solaris, Mac OS X – Packages available for all popular distributions • No external/third-party software dependencies • 10gen maintains drivers for over a dozen languages
  • 66. • Content Management • Operational Intelligence • E-Commerce • User Data Management • High Volume Data Feeds
  • 67. [no text]
  • 68. Open source, high performance database