NOSQL, COUCHDB
 AND THE CLOUD
    Brad Anderson
       Cloudant




          1
BRAD ANDERSON

• BS Hotel Management

• Restaurant Chain Data - econometric modeling, BI/DW

• Open Source - trac, dsource.org, couchdb

• NOSQLEast 2009

• Cloudant

• http://twitter.com/boorad


                                2
AGENDA

• NOSQL

• COUCHDB

 •   Erlang

• Cloud

 •   Dynamo

 •   MapReduce


                   3
IF YOU
• don’t have ‘medium data’ or ‘big data’

• are cool with 25K loc object-relational mappers

• love an ops challenge

• are okay paying Uncle Larry



                              4
IF NOT




http://www.bigfatmoneybags.com/blog/wp-content/uploads/2009/12/screwed.jpg

                                                                             5
RELATIONAL DATABASES


• Rigid Schema / ORM fun

• Scale Up

• Everything is a Nail

[Image caption: RDBMS 1970-2010]



http://www.flickr.com/photos/36041246@N00/3419197777/

                                                       6
SCALING RDBMS

•   Replication Sucks
    •   master-slave
    •   master-master
•   Partitioning Sucks
    •   vertical (by functional area)
    •   horizontal (by some key, say time)
•   Caching sort of works


                                        7
OKAY, NOT SCREWED




http://www.bigfatmoneybags.com/blog/wp-content/uploads/2009/12/screwed.jpg

                                                                             8
RELATIONAL DATABASES


• Not Dead

• Just have a ‘smell’ for certain tasks

[Image caption: RDBMS 1970-]




http://www.flickr.com/photos/36041246@N00/3419197777/

                                                       9
NOSQL

        NOT ONLY SQL
   A moniker for different data storage systems
         solving very different problems,
all where a relational database is not the right fit.

                         10
RIGHT FIT

• Google indexes 400 Pb / day (2007)

• CERN, LHC generates 100 Pb / sec

• Unique data created each year (IDC, 2007)
  • 2007 40 Eb

  • 2010 988 Eb (exponential growth)

• Flightcaster


                                     11
FOUR CATEGORIES
• Key/Value Stores
   • Dynomite, Voldemort, Tokyo Cabinet

• Document Stores
   • CouchDB, MongoDB

• Column Stores / BigTable
   • HBase, Hypertable, Cassandra

• Graph Databases
   • Neo4j, AllegroGraph, VertexDB


                                       12
BIG TAKEAWAY

[Diagram: on the left, a single function sits above a stack of ten data blocks; on the right, ten function/data pairs are arranged in a circle, each function next to its own data.]

                  Bring the function to the data
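
A minimal sketch of the idea, assuming three hypothetical partitions and a simple sum as the function. The names `partitions` and `local_sum` are illustrative only; nothing here is CouchDB or Cloudant code.

```python
from functools import reduce

# Hypothetical partitions: each list stands in for the data held by one node.
partitions = [
    [3, 1, 4],
    [1, 5, 9],
    [2, 6],
]

def local_sum(rows):
    """The function that is shipped to each partition and runs next to its data."""
    return sum(rows)

# Each node computes a small partial result locally...
partials = [local_sum(rows) for rows in partitions]

# ...and only those partial results travel back to be combined.
total = reduce(lambda a, b: a + b, partials, 0)
print(partials, total)
```

The point of the design is that three small numbers cross the network instead of every row.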
                                13
14
HUH? ERLANG?


• Programming Language created at Ericsson (20 yrs old now)

• Designed for scalable, long-lived systems

• Compiled, Functional, Dynamically Typed, Open Source




                                  15
3 BIGGIES
• Massively Concurrent

    • green threads, very lightweight != os threads


• Seamlessly Distributed

    • node = os process = VM, processes can live anywhere


• Fault Tolerant

    • 99.9999999% availability = 32ms downtime per year - AXD301
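
The nine-nines figure can be checked with a little arithmetic. This is a sketch that assumes a year of 365.25 days; the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # 31,557,600 seconds

def downtime_ms_per_year(availability_percent: float) -> float:
    """Expected downtime in milliseconds per year for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR * 1000.0

nine_nines = downtime_ms_per_year(99.9999999)
five_nines = downtime_ms_per_year(99.999)
print(nine_nines, five_nines)
```

Nine nines comes out at roughly 32 ms per year, in line with the slide; five nines is a little over five minutes.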


                                         16
Official Beta!

Apache CouchDB


          17
COUCHDB
• Schema-free document database server

• Robust, highly concurrent, fault-tolerant

• RESTful JSON API

• Futon web admin console

• MapReduce system for generating custom views

• Bi-directional incremental replication

• couchapp: lightweight HTML+JavaScript apps served directly
 from CouchDB using views to transform JSON
                                   18
FROM INTEREST TO ADOPTION




• 100+ production users

• 3 books being written

• Vibrant, open community

• Active commercial development

• Rapidly maturing
                              19
OF THE WEB

   Django may be built for the Web, but
  CouchDB is built of the Web. I've never
seen software that so completely embraces
  the philosophies behind HTTP ... this is
 what the software of the future looks like.


                Jacob Kaplan-Moss
                 October 17 2007

   http://jacobian.org/writing/of-the-web/
                       20
DOCUMENTS




• Documents are JSON Objects

• Underscore-prefixed fields are reserved

• Documents can have binary attachments

• MVCC: _rev is deterministically generated from doc content
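
To make "deterministically generated from doc content" concrete, here is an illustrative sketch. The function `sketch_rev` and the sample fields are hypothetical, and the real server computes its hash differently; only the general shape (a generation number, a dash, a content hash) is meant to carry over.

```python
import hashlib
import json

def sketch_rev(generation: int, body: dict) -> str:
    """Illustrative only: derive a revision id from the document content."""
    # Reserved underscore-prefixed fields are left out of the content hash.
    content = {k: v for k, v in body.items() if not k.startswith("_")}
    # Canonical JSON (sorted keys, fixed separators) so equal content hashes equally.
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"))
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
    return f"{generation}-{digest}"

doc = {"_id": "mydocid", "author": "boorad", "title": "NOSQL, CouchDB and the Cloud"}
same_content = {"title": "NOSQL, CouchDB and the Cloud", "author": "boorad", "_id": "mydocid"}
edited = {"_id": "mydocid", "author": "boorad", "title": "A different title"}

print(sketch_rev(1, doc))
```

Because the id depends only on content, two nodes that receive the same edit independently arrive at the same revision.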
                               21
ROBUST

• Never overwrite previously committed data

• In the event of a server crash or power failure, just restart
 CouchDB -- there is no “repair”

• Take snapshots with “cp”

• Configurable levels of durability: can choose to fsync after
 every update, or less often to gain better throughput



                                22
CONCURRENT

• Erlang approach: lightweight processes to model the natural
 concurrency in a problem

• For CouchDB that means one process per TCP connection

• Lock-free architecture; each process works with an MVCC
 snapshot of a DB.

• Performance degrades gracefully under heavy concurrent load


                               23
REST API
• Create
 PUT /mydb/mydocid

• Retrieve
 GET /mydb/mydocid

• Update
 PUT /mydb/mydocid

• Delete
 DELETE /mydb/mydocid
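
The four calls above, sketched with Python's standard library. The requests are built but not sent. The base URL assumes a local CouchDB on its default port 5984, and the revision strings are placeholders rather than real revisions.

```python
import json
import urllib.request
from typing import Optional

BASE = "http://localhost:5984"  # assumption: local server on the default port

def build(method: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the assumed server."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(BASE + path, data=data, headers=headers, method=method)

create = build("PUT", "/mydb/mydocid", {"title": "hello"})
retrieve = build("GET", "/mydb/mydocid")
# An update is a PUT that carries the current _rev, so stale writers are rejected.
update = build("PUT", "/mydb/mydocid", {"_rev": "1-abc", "title": "hello again"})
# A delete names the revision it is deleting.
delete = build("DELETE", "/mydb/mydocid?rev=2-def")

for req in (create, retrieve, update, delete):
    print(req.get_method(), req.full_url)
# To actually send one against a running server: urllib.request.urlopen(create)
```

Create and update share a URL and a verb; the only difference is the `_rev` field, which is how MVCC shows up in the API.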

                        24
25
VIEWS
• Custom, persistent representations of document data

• “Close to the metal” -- no dynamic queries in production, so
 you know exactly what you’re getting

• Generated using MapReduce functions written in JavaScript
 (and other languages)

• Each view must have a map function and may also have a
 reduce function

• Leverages view collation, rich view query API
                                 26
DOCUMENTS BY AUTHOR
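
A sketch of what a documents-by-author view computes. CouchDB views are normally written in JavaScript; this simulates the same map/reduce contract in Python, and the sample documents and function names are hypothetical.

```python
from collections import defaultdict

docs = [
    {"_id": "a", "author": "brad", "title": "Dynamo rings"},
    {"_id": "b", "author": "jan", "title": "Couchapps"},
    {"_id": "c", "author": "brad", "title": "Quorums"},
    {"_id": "d", "title": "No author field"},
]

def map_by_author(doc):
    """Map step: yield (key, value) rows, in the spirit of emit(doc.author, 1)."""
    if "author" in doc:
        yield doc["author"], 1

def reduce_count(values):
    """Reduce step: count the rows for a key."""
    return sum(values)

# The view index is the map output sorted by key.
rows = sorted(row for doc in docs for row in map_by_author(doc))

# Group rows by key, then reduce each group.
grouped = defaultdict(list)
for key, value in rows:
    grouped[key].append(value)

docs_by_author = {key: reduce_count(values) for key, values in grouped.items()}
print(docs_by_author)
```

Documents without an `author` field simply emit nothing, which is how a schema-free store copes with mixed document shapes.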




         27
INCREMENTAL
• Computing a view can be expensive, so CouchDB saves the
 result in a B-tree and keeps it up-to-date

• Leaf nodes store map results, inner nodes store reductions of
 children
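
Why storing reductions in inner nodes keeps updates cheap can be sketched with a plain binary tree standing in for the B-tree. This is a simplification with a hypothetical class name, not CouchDB's storage format.

```python
class ReductionTree:
    """Binary tree stand-in for the view B-tree: leaves hold map values,
    inner nodes hold the reduction (here, a sum) of their children."""

    def __init__(self, values):
        self.size = 1
        while self.size < len(values):
            self.size *= 2
        # nodes[1] is the root; leaves start at index self.size.
        self.nodes = [0] * (2 * self.size)
        for i, value in enumerate(values):
            self.nodes[self.size + i] = value
        for i in range(self.size - 1, 0, -1):
            self.nodes[i] = self.nodes[2 * i] + self.nodes[2 * i + 1]

    def update(self, index, value):
        """Change one leaf and recompute only the inner nodes above it.
        Returns how many inner nodes were touched."""
        pos = self.size + index
        self.nodes[pos] = value
        touched = 0
        pos //= 2
        while pos >= 1:
            self.nodes[pos] = self.nodes[2 * pos] + self.nodes[2 * pos + 1]
            touched += 1
            pos //= 2
        return touched

    def total(self):
        """The root holds the reduction over every leaf."""
        return self.nodes[1]

tree = ReductionTree([1, 1, 1, 1, 1, 1, 1, 1])
touched = tree.update(3, 5)
print(tree.total(), touched)
```

With eight leaves, one changed document touches three inner nodes rather than forcing a re-reduce of the whole view.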




 http://horicky.blogspot.com/2008/10/couchdb-implementation.html
                               28
REPLICATION
• Peer-based, bi-directional replication using normal HTTP calls

• Mediated by a replicator process which can live on the
 source, target, or somewhere else entirely

• Replicate a subset of documents in a DB meeting criteria
 defined in a custom filter function (coming soon)

• Applications (_design documents) replicate along with the
 data

• Ideal for offline applications -- “ground computing”
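
Replication is triggered with an ordinary HTTP call, a POST to `/_replicate` naming a source and a target. The sketch below builds that request without sending it; the local base URL and the remote database URL are assumptions for illustration.

```python
import json
import urllib.request

LOCAL = "http://localhost:5984"             # assumption: local server, default port
REMOTE_DB = "http://example.com:5984/mydb"  # hypothetical remote database

def replication_request(source: str, target: str) -> urllib.request.Request:
    """Build (but do not send) a one-way replication request."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        LOCAL + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Bi-directional replication is two one-way replications with the roles swapped.
push = replication_request("mydb", REMOTE_DB)
pull = replication_request(REMOTE_DB, "mydb")
print(push.full_url, json.loads(push.data))
```

Because the replicator only needs HTTP access to both ends, the server receiving this request can be the source, the target, or a third machine.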
                                   29
CLOUD




  30
SHOWROOM
 A cluster of couches

Help me, this name sucks!

          31
ARCHITECTURE

• Each cluster is a ring of nodes (Dynamo, Dynomite)

• Any node can handle request (consistent hashing)

  • O(1), with a hop

• nodes own partitions (ring is divided)

• data are distributed evenly across partitions and replicas

• mapreduce functions are passed to nodes for execution
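
A simplified sketch of ring placement with a fixed number of partitions. The node names, the values of `Q` and `N`, the SHA-1 hash, and the round-robin assignment of partitions to nodes are all assumptions made for illustration; a real Dynamo-style cluster keeps an explicit partition map instead.

```python
import hashlib

Q = 3                                          # hypothetical: 2**3 = 8 partitions
NODES = ["node1", "node2", "node3", "node4"]   # hypothetical node names
N = 3                                          # replicas per partition

def partition_for(doc_id: str) -> int:
    """Hash a document id onto one of the 2**Q partitions of the ring."""
    digest = hashlib.sha1(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % (2 ** Q)

def nodes_for(partition: int) -> list:
    """Simplified placement: an owner node plus the next N-1 nodes clockwise."""
    start = partition % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(N)]

def route(doc_id: str) -> list:
    """Any node can compute this locally, then forward the request in one hop."""
    return nodes_for(partition_for(doc_id))

print(partition_for("blah"), route("blah"))
```

The key-to-partition step depends only on the key and `Q`, so it stays fixed as nodes come and go; only the partition-to-node step has to change.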
RESEARCH


• Google’s MapReduce, http://bit.ly/bJbyq5

• Amazon’s Dynamo, http://bit.ly/b7FlsN

• CAP theorem, http://bit.ly/bERr2H
CLUSTER CONTROLS

• N - Replication

• Q - Partitions = 2^Q

• R - Read Quorum

• W - Write Quorum

• These constants define the cluster
N

Trade-off: Throughput vs. Consistency, Durability

    N = Number of replicas per item stored in cluster

Q

Trade-off: Throughput vs. Scalability

     2^Q = Number of partitions (shards) in cluster
       T = Number of nodes in cluster
 2^Q / T = Number of partitions per node

R

Trade-off: Latency vs. Consistency

    R = Number of successful reads before
        returning value(s) to client

W

Trade-off: Latency vs. Durability

    W = Number of successful writes before
        returning ‘success’ to client
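
The four constants and the arithmetic around them, as a small sketch. The class and property names are made up, and `q=8` is an assumed value; the overlap rule `R + W > N` is the one described in the Dynamo paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClusterConfig:
    n: int  # replicas per item
    q: int  # the cluster has 2**q partitions
    r: int  # read quorum
    w: int  # write quorum
    t: int  # nodes in the cluster

    @property
    def partitions(self) -> int:
        return 2 ** self.q

    @property
    def partitions_per_node(self) -> float:
        return self.partitions / self.t

    @property
    def quorums_overlap(self) -> bool:
        """R + W > N means every read quorum shares at least one
        replica with every write quorum."""
        return self.r + self.w > self.n

cfg = ClusterConfig(n=3, q=8, r=2, w=2, t=24)
fast_but_loose = ClusterConfig(n=3, q=8, r=1, w=1, t=24)
print(cfg.partitions, cfg.partitions_per_node, cfg.quorums_overlap)
```

Lowering R or W buys latency at the cost of overlap, which is the trade-off each of the slides above is pointing at.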
Load Balancer

[Diagram: a load balancer sits in front of a ring of 24 nodes. Each node holds four adjacent partitions: Node 1 = A B C D, Node 2 = B C D E, Node 3 = C D E F, Node 4 = D E F G, ... Node 24 = X Y Z A.]

request

    PUT http://boorad.cloudant.com/dbname/blah?w=2

[Diagram builds on the same ring: first the request is shown, then the label hash(blah) = E, then the cluster settings N=3, W=2, R=2, and finally a "node down" marker.]
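
A toy simulation of the write in the ring walkthrough above, assuming partition E is replicated on Nodes 2, 3 and 4 (the nodes shown holding E in the diagram) and that Node 3 is the one that goes down. Which node actually fails in the slide is not recoverable, so that choice is an assumption, as are the function and variable names.

```python
def quorum_write(replicas, up, w):
    """Try the write on every replica; report success if at least w acknowledge."""
    acks = [node for node in replicas if node in up]
    return len(acks) >= w, acks

replicas_for_e = ["node2", "node3", "node4"]          # N = 3 copies of partition E
all_up = {"node1", "node2", "node3", "node4"}

# Healthy cluster: all three replicas acknowledge.
ok_healthy, acks_healthy = quorum_write(replicas_for_e, all_up, w=2)

# One replica down: two acknowledgements still satisfy W = 2.
ok_one_down, acks_one_down = quorum_write(replicas_for_e, all_up - {"node3"}, w=2)

# Two replicas down: only one acknowledgement, so the write is not confirmed.
ok_two_down, acks_two_down = quorum_write(replicas_for_e, {"node1", "node2"}, w=2)

print(ok_healthy, ok_one_down, ok_two_down)
```

With N=3 and W=2 the cluster keeps accepting writes through a single node failure, which is the situation the final build of the diagram shows.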
RESULT

• For standalone or cluster
  • one REST API

  • one URL

• For cluster
  • redundant data

  • distributed queries

  • scale out

                                40
DEMO?




  41
QUESTIONS?
CREDITS



• Emil Eifrem, http://bit.ly/5D40WQ

• Sergio Bossa, http://bit.ly/c9UoRZ

• Cliff Moon, http://bit.ly/bX887c




                                   43

DevNation Atlanta


Editor's Notes

  • #5 You are not Google
  • #16 Right Tool for the Job
  • #17 small companies huge data information from data
  • #21 20 yrs old, open source since mid-90’s, iirc. like a mobile telephone grid compiled (but to bytecode for a VM) open source
  • #23 Cluster Of Unreliable Commodity Hardware