Not only SQL
   Mårten Gustafson
  Qbranch CODE tech-meet @ 2010-04-15
What?
“NoSQL is a movement promoting a loosely
defined class of non-relational data stores that
break with a long history o...
What?
“NoSQL is a movement promoting a loosely
defined class of non-relational data stores
that break with a long history o...
Why?

• Non-relational
• Schema-less
• “Easily” scalable
• REST/JSON API = web friendly
What’s out there?
                 Storage type        License      Implemented in
Amazon Dynamo      Key/Value           ...
Distribution


• Master / Slave
• Master / Slave(s)
• Masterless (Master / Master)
Distribution
                       Masterless   Master/Slave   Hot standby
 Amazon Dynamo             X
     Cassandra   ...
Distribution
                           Masterless     Master/Slave
                                                      ...
Common factor


   “...of the web...”
     Of the who?!
Of the web
“...Django may be built for the Web, but
CouchDB is built of the Web. I’ve never seen
software that so complete...
Of the web

“...CouchDB may succeeded, and it may fail; who
knows. I’m sure of one thing, though — this is
what the softwa...
So freakin’ what?!



All your webish skillz and tools apply...
So freakin’ what?!
 language-, platform- and OS-neutral

load balancers                         proxies

                 ...
These guys can just suck it




  HTTP/REST is integration that works
                (YMMV)
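To make the "webish tools apply" point concrete, here is a minimal sketch of the ETag / conditional GET handshake that HTTP clients and caches rely on. The `handleGet` function and the shape of `resource` are hypothetical and only model the idea in memory; they are not the API of any of the stores mentioned.

```javascript
// Hypothetical in-memory model of a conditional GET (not a real server).
// If the client already holds the current version (matching ETag),
// answer 304 with no body; otherwise send the full representation.
function handleGet(resource, requestHeaders) {
  if (requestHeaders["if-none-match"] === resource.etag) {
    return { status: 304, headers: { etag: resource.etag }, body: null };
  }
  return {
    status: 200,
    headers: { etag: resource.etag, "content-type": resource.contentType },
    body: resource.body,
  };
}

// Usage: first request gets the body, a repeat with the ETag gets a 304.
const sampleResource = { etag: '"v1"', contentType: "application/json", body: '{"hello":"web"}' };
const firstResponse = handleGet(sampleResource, {});
const repeatResponse = handleGet(sampleResource, { "if-none-match": firstResponse.headers.etag });
console.log(firstResponse.status, repeatResponse.status);
```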
Buckle up, Dorothy, ’cause Kansas is going bye-bye
Key/Value Store


     I got keys but no locks
Riak

Decentralized key-value store
A flexible map/reduce engine
HTTP/JSON API
A database ideally suited for Web applicatio...
The Ring
The Ring
            12     1
     11                    2

10                             3

9                           ...
The Ring


One Ring size to rule them all, One Ring size to
find them, One Ring size to bring them all and in
the cluster b...
Consistent Hashing
 Store/Save (PUT)
Consistent Hashing
          Read (GET)
“I want [key]” is answered by:
where is [key] on the ring?
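A rough sketch of "where is it on the ring?", assuming the ring size of 12 from the earlier slide. The hash function (32-bit FNV-1a) and the names `fnv1a` and `partitionFor` are stand-ins chosen for brevity; this is not Riak's actual hashing.

```javascript
// Assumption: a small 32-bit FNV-1a hash stands in for the real hash.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash >>> 0;
}

// Map a bucket/key pair to a slice of the ring, numbered 1..ringSize
// like the slide. The same input always lands on the same slice, so a
// GET looks in the place the PUT wrote to.
function partitionFor(bucket, key, ringSize) {
  return (fnv1a(bucket + "/" + key) % ringSize) + 1;
}

// Usage with hypothetical bucket and key names.
console.log(partitionFor("artists", "REM", 12));
```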
Cluster

                                            Instance A


                                            Instance B

...
Cluster

                                            Instance A


                                            Instance B

...
Cluster - Read (GET)
  Instance A   Instance B   Instance C
Cluster - Read (GET)
I can haz    ?



                       Instance A   Instance B   Instance C

Hm,       lives in a
s...
Cluster - Read (GET)
I can haz   ?



                 Instance A   Instance B   Instance C

                             ...
Cluster - Read (GET)
I can haz   ?
                                                         Here ya go


                I...
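The read walkthrough above can be sketched like this, assuming 12 slices and three instances as in the Cluster slide. Handing out slices round-robin is an assumption made for the sketch, and `claimPartitions` and `route` are hypothetical names.

```javascript
// Assumption: slices are claimed round-robin, so with 12 slices and
// 3 instances each instance ends up with 4 slices.
function claimPartitions(ringSize, instances) {
  const owners = new Map();
  for (let p = 1; p <= ringSize; p++) {
    owners.set(p, instances[(p - 1) % instances.length]);
  }
  return owners;
}

// Any instance can take the request; if it does not own the slice,
// it asks the owner and passes the answer back.
function route(receivedBy, partition, owners) {
  const owner = owners.get(partition);
  return { servedBy: owner, forwarded: owner !== receivedBy };
}

// Usage: instance A receives a request for something living in slice 12.
const ringOwners = claimPartitions(12, ["A", "B", "C"]);
console.log(route("A", 12, ringOwners));
```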
Riak “stuff”
Riak “stuff”




           Bucket
  Container/keyspace.
Determines number of
replicas for its contents
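One way to picture "number of replicas", in the style of the Dynamo paper: copies go to consecutive slices clockwise from the slice the key hashes to. The name `preferenceList` and the parameter `replicas` are illustrative assumptions, not taken from the slides.

```javascript
// Sketch: pick `replicas` consecutive slices, starting at the slice the
// key hashes to and wrapping around the end of the ring.
function preferenceList(startPartition, replicas, ringSize) {
  const slices = [];
  for (let i = 0; i < replicas; i++) {
    slices.push(((startPartition - 1 + i) % ringSize) + 1);
  }
  return slices;
}

// Usage: 3 replicas on a ring of 12, key hashes to slice 11.
console.log(preferenceList(11, 3, 12)); // [ 11, 12, 1 ]
```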
Riak “stuff”

  Bucket: Container/keyspace. Determines number of replicas for its contents
  Consistent Hashing: Key hashing technique used to distribute keys on the ring
  Gossiping: Shares state, bucket and ring knowledge in the cluster
  Hinted Handoff: Covering for a failed “neighbor” node while gone
  Links: Allows retrieval of “weakly” linked objects
  Merkle Tree: Data structure for efficient summary about objects. Gossiped.
  Node: One server. Runs vnodes which claim partitions.
  Partition: One slice (part) of the ring.
  Read Repair: Auto correction of out-of-date objects
  Replica: Number of copies of the same object in the cluster
  Ring: The complete “space”, divided into partitions which are claimed by vnodes
  Vector Clock: Conflict detection technique for objects.
  Vnode: Runs in a node and claims one partition on the ring
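Vector clocks are the least obvious item among the Riak terms, so here is a minimal sketch of the idea. Clocks are modelled as plain objects mapping an actor name to a counter; the names `tick` and `compareClocks` and the result labels are assumptions made for this sketch, not Riak's API.

```javascript
// Return a new clock with the actor's counter bumped by one.
function tick(clock, actor) {
  return { ...clock, [actor]: (clock[actor] || 0) + 1 };
}

// Compare two clocks. "concurrent" means neither version has seen the
// other's updates, i.e. a conflict that needs resolving.
function compareClocks(a, b) {
  const actors = new Set([...Object.keys(a), ...Object.keys(b)]);
  let aAhead = false;
  let bAhead = false;
  for (const actor of actors) {
    const x = a[actor] || 0;
    const y = b[actor] || 0;
    if (x > y) aAhead = true;
    if (y > x) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// Usage: two clients update the same object independently.
const baseClock = tick({}, "client1");
const leftClock = tick(baseClock, "client1");
const rightClock = tick(baseClock, "client2");
console.log(compareClocks(leftClock, rightClock)); // concurrent
```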
Riak - Takeaways

• No single point of failure
• Choose your levels for:
 • availability
 • consistency
 • partition tolerance
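"Choose your levels" is usually expressed, in Dynamo-style stores, as three numbers: N copies of each object, R copies that must answer a read, W copies that must acknowledge a write. The sketch below only checks the well-known overlap rule R + W > N; the function name `quorumOverlaps` is an assumption.

```javascript
// With n replicas, a read from r of them is guaranteed to touch at
// least one replica that took part in the latest successful write to
// w of them exactly when r + w > n.
function quorumOverlaps(n, r, w) {
  if (r < 1 || w < 1 || r > n || w > n) {
    throw new RangeError("r and w must be between 1 and n");
  }
  return r + w > n;
}

// Usage: favour consistency (2 + 2 > 3) or favour availability (1 + 1 <= 3).
console.log(quorumOverlaps(3, 2, 2), quorumOverlaps(3, 1, 1));
```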
But wait, there’s more...

•   Binary data + Content-Type = whatever

    •   MP3’s, Images, Text, ...

•   Map/Reduce

  ...
This slide intentionally left blank
Document Store


      Relax
CouchDB

Document-oriented database
Kick ass replication
HTTP/JSON API
Map/reduce view (index) definitions
World view
One document == JSON
One document == One record
Many documents == One database
Many databases == One instance
No schema
World view

Documents can
  have attachments (binary + mime type)
  be rendered differently (HTML, XML)
A document
                                                       Key, either you
                                        ...
Views

Filter
Collate
Aggregate
Views
{
    "_id": "b098445d587b1f347e48e1a79301de02",
    "_rev": "1-80bfd8302e0f08eec2396c8107cafc19",
    "platform": {...
Views
Views are stored
     as an accessible web resource
     on disk
     and incrementally updated
     as well as replicated with the database
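To show how a map function turns documents into view rows, here is a tiny stand-alone emulation. It is only a sketch: in CouchDB `emit` is provided by the view server, whereas here it is passed in as a second argument, and `buildView` is a hypothetical helper. The first document is the one from the slides; the second is made up to show collation by key.

```javascript
// Run a map function over every document and collect the emitted rows,
// sorted (collated) by key.
function buildView(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    const emit = (key, value) => rows.push({ id: doc._id, key, value });
    mapFn(doc, emit);
  }
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return { total_rows: rows.length, offset: 0, rows };
}

// Same logic as the map function on the Views slide.
const browserVersions = (doc, emit) => {
  emit(doc.platform.browser, doc.platform.version);
};

const sampleDocs = [
  { _id: "b098445d587b1f347e48e1a79301de02", platform: { browser: "mozilla", version: "1.9.1.8" } },
  { _id: "hypothetical-second-doc", platform: { browser: "chrome", version: "5.0" } },
];
console.log(JSON.stringify(buildView(sampleDocs, browserVersions), null, 2));
```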
Replication
Peer to peer
Online/Offline
Conflict detection and resolution
Any number of nodes
     Local
     Remote
Replication
Replication
Replication
Replication
Replication
CouchDB “stuff”

Append only
 Hence, won’t corrupt its data files

MVCC
 Multi version concurrency control.
 Writers do not block readers.
 Readers do not block writers.
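A minimal in-memory sketch of the MVCC idea: every update must name the revision it is based on, and a stale revision is rejected instead of anyone being locked out. `TinyDocStore` and the `"N-sketch"` revision format are made up for illustration; real CouchDB revisions look like the `_rev` value shown on the document slide.

```javascript
class TinyDocStore {
  constructor() {
    this.docs = new Map();
  }

  // Readers get the stored version; it is never modified in place.
  get(id) {
    return this.docs.get(id);
  }

  // Writers must pass the _rev they last read (or none for a new doc).
  put(doc) {
    const current = this.docs.get(doc._id);
    const currentRev = current ? current._rev : undefined;
    if (doc._rev !== currentRev) {
      return { ok: false, error: "conflict" };
    }
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    const stored = Object.freeze({ ...doc, _rev: `${generation}-sketch` });
    this.docs.set(doc._id, stored); // replace, do not mutate the old version
    return { ok: true, id: doc._id, rev: stored._rev };
  }
}

// Usage: the second writer works from a stale revision and is refused.
const demoStore = new TinyDocStore();
const created = demoStore.put({ _id: "note", text: "v1" });
demoStore.put({ _id: "note", _rev: created.rev, text: "v2" });
console.log(demoStore.put({ _id: "note", _rev: created.rev, text: "late" }));
```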
BDCRR
 Bi-directional, conflict resolving, replication

Compaction
 Append only will cause data files to grow.
 Compaction to the rescue, in the background - for your pleasure.

ACID
 Awesome, Cool, Impressive, Dope
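Append only and compaction go together, which a toy log makes easy to see. This is an in-memory sketch under obvious assumptions (an array instead of a data file, a hypothetical `AppendOnlyLog` class), not how CouchDB lays out its files.

```javascript
class AppendOnlyLog {
  constructor() {
    this.entries = [];
  }

  // Writes never touch existing entries; they only add to the end.
  write(id, value) {
    this.entries.push({ id, value });
  }

  // The newest entry for an id wins, so scan from the end.
  read(id) {
    for (let i = this.entries.length - 1; i >= 0; i--) {
      if (this.entries[i].id === id) return this.entries[i].value;
    }
    return undefined;
  }

  // Compaction keeps only the latest entry per id and swaps in the
  // smaller log, which is why the "file" stops growing.
  compact() {
    const latest = new Map();
    for (const entry of this.entries) latest.set(entry.id, entry);
    this.entries = [...latest.values()];
  }
}

// Usage: three writes, two of them to the same id.
const demoLog = new AppendOnlyLog();
demoLog.write("a", 1);
demoLog.write("a", 2);
demoLog.write("b", 1);
demoLog.compact();
console.log(demoLog.entries.length, demoLog.read("a"));
```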
CouchDB - Takeaways

• Kick ass replication
• Views are fast
• Can host and serve complete webapps
Outro

• Test one or more NoSQL thingys
• Get familiar with Brewer’s CAP theorem
• Get familiar with the Dynamo paper
Over and out.

Mårten Gustafson
@martengustafson

http://marten.gustafson.pp.se/

marten.gustafson@gmail.com
NoSQL @ Qbranch -2010-04-15


NoSQL overview presentation with details on Riak and CouchDB.

Presented at Qbranch CODE Night 2010-04-15.


Thanks to @frli01 for arranging and @xlson for invitation.

  • * Relational is not always the most suitable model
    * Schema-less gives freedom
    * Non-relational gives interesting scalability capabilities (which most provide)
    * Most provide a REST/JSON API
    ** Very suitable for web development
    ** Easy peasy to use, regardless of environment

  • * Hinted handoff
  • collation - assembling in proper numerical or logical sequence
  • Simplified view explanation

  • Transcript of "NoSQL @ Qbranch -2010-04-15"

    1. 1. Not only SQL Mårten Gustafson Qbranch CODE tech-meet @ 2010-04-15
    2. 2. What? “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia
    3. 3. What? “NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases” - Wikipedia Not a single technique Not a single type of data Not a single type of use case
    4. 4. Why? • Non-relational • Schema-less • “Easily” scalable • REST/JSON API = web friendly
    5. 5. What’s out there?
                          Storage type    License             Implemented in
          Amazon Dynamo   Key/Value       n/a                 ?
          Cassandra       Columnfamily    ASL 2.0             Java
          CouchDB         Document        ASL 2.0             Erlang
          Dynomite        Key/Value       BSD/MIT-style       Erlang
          HBase           Columnfamily    ASL 2.0             Java
          MongoDB         Document        AGPL v3.0           C++
          Neo4J           Graph           AGPL v3.0 / Comm    Java
          Riak            Key/Value       ASL 2.0             Erlang
          Redis           Key/Value       BSD/MIT-style       C
          Scalaris        Key/Value       ASL 2.0             Erlang
          Tokyo Cabinet   Key/Value       LGPL                C
          Voldemort       Key/Value       ASL 2.0             Java
    6. 6. Distribution • Master / Slave • Master / Slave(s) • Masterless (Master / Master)
    7. 7. Distribution Masterless Master/Slave Hot standby Amazon Dynamo X Cassandra X CouchDB X Dynomite X HBase ? MongoDB X X Neo4J* Riak X Redis X Scalaris X Tokyo Cabinet Voldemort X * Neo4J HA coming “soon”
    8. 8. Distribution Masterless Master/Slave Hot standby Amazon Dynamo X Cassandra X CouchDB X Dynomite X HBase ? MongoDB X X Neo4J* Riak X Redis X Scalaris X Tokyo Cabinet Voldemort X * Neo4J HA coming “soon” (overlay text: “This is a very simplified view”)
    9. 9. Common factor “...of the web...” Of the who?!
    10. 10. Of the web “...Django may be built for the Web, but CouchDB is built of the Web. I’ve never seen software that so completely embraces the philosophies behind HTTP. CouchDB makes Django look old-school in the same way that Django makes ASP look outdated” - http://jacobian.org/writing/of-the-web/
    11. 11. Of the web “...CouchDB may succeeded, and it may fail; who knows. I’m sure of one thing, though — this is what the software of the future looks like” - http://jacobian.org/writing/of-the-web/
    12. 12. So freakin’ what?! All your webish skillz and tools apply...
    13. 13. So freakin’ what?! language-, platform- and OS-neutral load balancers proxies MIME / Content-Type All your webish skillz and tools apply... HTTP client libs (etag, if-modified-since, etc) caches
    14. 14. These guys can just suck it HTTP/REST is integration that works (YMMV)
    15. 15. Buckle up, Dorothy, ’cause Kansas is going bye-bye
    16. 16. Key/Value Store I got keys but no locks
    17. 17. Riak Decentralized key-value store A flexible map/reduce engine HTTP/JSON API A database ideally suited for Web applications
    18. 18. The Ring
    19. 19. The Ring 12 1 11 2 10 3 9 4 8 5 7 6 ring size = 12
    20. 20. The Ring One Ring size to rule them all, One Ring size to find them, One Ring size to bring them all and in the cluster bind them...
    21. 21. Consistent Hashing Store/Save (PUT)
    22. 22. Consistent Hashing Store/Save (PUT)
    23. 23. Consistent Hashing Read (GET) “I want [key]” is answered by: where is [key] on the ring?
    24. 24. Consistent Hashing Read (GET) “I want [key]” is answered by: where is [key] on the ring?
    25. 25. Cluster Instance A Instance B Instance C ring size = 12 instances = 3 ring size / instances = ~slices per instance
    26. 26. Cluster Instance A Instance B Instance C ring size = 12 instances = 3 ring size / instances = ~slices per instance
    27. 27. Cluster - Read (GET) Instance A Instance B Instance C
    28. 28. Cluster - Read (GET) I can haz [key]? Instance A Instance B Instance C “Hm, [key] lives in a slice of the ring owned by instance C.”
    29. 29. Cluster - Read (GET) I can haz [key]? Instance A Instance B Instance C “Hey C! I need [key]” “Okidoki, now where’s he...a yeah in my fourth slice”
    30. 30. Cluster - Read (GET) I can haz [key]? Instance A Instance B Instance C “Here ya go” “Cheers!”
    31. 31. Riak “stuff”
    32. 32. Riak “stuff” Bucket Container/keyspace. Determines number of replicas for its contents
    33. 33. Riak “stuff” (adds) Consistent Hashing: Key hashing technique used to distribute keys on the ring
    34. 34. Riak “stuff” (adds) Gossiping: Shares state, bucket and ring knowledge in the cluster
    35. 35. Riak “stuff” (adds) Hinted Handoff: Covering for a failed “neighbor” node while gone
    36. 36. Riak “stuff” (adds) Links: Allows retrieval of “weakly” linked objects
    37. 37. Riak “stuff” (adds) Merkle Tree: Data structure for efficient summary about objects. Gossiped.
    38. 38. Riak “stuff” (adds) Node: One server. Runs vnodes which claim partitions.
    39. 39. Riak “stuff” (adds) Partition: One slice (part) of the ring.
    40. 40. Riak “stuff” (adds) Read Repair: Auto correction of out-of-date objects
    41. 41. Riak “stuff” (adds) Replica: Number of copies of the same object in the cluster
    42. 42. Riak “stuff” (adds) Ring: The complete “space”, divided into partitions which are claimed by vnodes
    43. 43. Riak “stuff” (adds) Vector Clock: Conflict detection technique for objects.
    44. 44. Riak “stuff” (adds) Vnode: Runs in a node and claims one partition on the ring
    45. 45. Riak - Takeaways • No single point of failure • Choose your levels for: • availability • consistency • partition tolerance
    46. 46. But wait, there’s more... • Binary data + Content-Type = whatever • MP3’s, Images, Text, ... • Map/Reduce • Local data, parallel
    47. 47. This slide intentionally left blank
    48. 48. Document Store Relax
    49. 49. CouchDB Document-oriented database Kick ass replication HTTP/JSON API Map/reduce view (index) definitions
    50. 50. World view One document == JSON One document == One record Many documents == One database Many databases == One instance No schema
    51. 51. World view Documents can have attachments (binary + mime type) be rendered differently (HTML, XML)
    52. 52. A document { "_id": "b098445d587b1f347e48e1a79301de02", "_rev": "1-80bfd8302e0f08eec2396c8107cafc19", "platform": { "browser": "mozilla", "version": "1.9.1.8" }, "timestamp": 1270131033337 } "_id": Key, either you choose it or CouchDB does it for you. "_rev": Revision number.
    53. 53. Views Filter Collate Aggregate
    54. 54. Views { "_id": "b098445d587b1f347e48e1a79301de02", "_rev": "1-80bfd8302e0f08eec2396c8107cafc19", "platform": { "browser": "mozilla", "version": "1.9.1.8" }, "timestamp": 1270131033337 } + function(doc) { emit(doc.platform.browser, doc.platform.version); } = { "total_rows": 1, "offset": 0, "rows": [ { "id": "b098445d587b1f347e48e1a79301de02", "key": "mozilla", "value": "1.9.1.8" } ] }
    55. 55. Views Views are stored as an accessible web resource on disk and incrementally updated as well as replicated with the database
    56. 56. Replication Peer to peer Online/Offline Conflict detection and resolution Any number of nodes Local Remote
    57. 57. Replication
    58. 58. Replication
    59. 59. Replication
    60. 60. Replication
    61. 61. Replication
    62. 62. CouchDB “stuff”
    63. 63. CouchDB “stuff” Append only Hence, won’t corrupt its data files
    64. 64. CouchDB “stuff” (adds) MVCC: Multi version concurrency control. Writers do not block readers. Readers do not block writers.
    65. 65. CouchDB “stuff” (adds) BDCRR: Bi-directional, conflict resolving, replication
    66. 66. CouchDB “stuff” (adds) Compaction: Append only will cause data files to grow. Compaction to the rescue, in the background - for your pleasure.
    67. 67. CouchDB “stuff” (adds) ACID: Awesome, Cool, Impressive, Dope
    68. 68. CouchDB - Takeaways • Kick ass replication • Views are fast • Can host and serve complete webapps
    69. 69. Outro • Test one or more NoSQL thingys • Get familiar with Brewer’s CAP theorem • Get familiar with the Dynamo paper
    70. 70. Over and out. Mårten Gustafson @martengustafson http://marten.gustafson.pp.se/ marten.gustafson@gmail.com