
NoSQL - what's that


Overview of NoSQL in general, its types and available most pop

Published in: Technology
  • Atomicity. All of the operations in the transaction will complete, or none will. Consistency. The database will be in a consistent state when the transaction begins and ends. Isolation. The transaction will behave as if it is the only operation being performed upon the database. Durability. Upon completion of the transaction, the operation will not be reversed.
  • Consistency. The client perceives that a set of operations has occurred all at once. Availability. Every operation must terminate in an intended response. Partition tolerance. Operations will complete, even if individual components are unavailable. http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  • Basically Available. Supporting partial failures without total system failure. Soft state. The state can be inconsistent for a given period of time. Eventual consistency. After some time all replicas will have consistent data. For a given accepted update and a given replica, eventually either the update reaches the replica or the replica retires from service.
  • http://labs.google.com/papers/bigtable.html http://labs.google.com/papers/gfs.html http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

    1. 1. NoSQL – What’s that?<br />Sergejus Barinovas | Microsoft MVP<br />@sergejusb, sergejus.blogas.lt<br />
    2. 2. NoSQL<br />
    3. 3. WHY?<br />
    4. 4. <ul><li>Limited SQL scalability</li></ul>Horizontal partitioning (sharding)<br />Vertical partitioning<br />NoSQL – Why?<br />
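Horizontal partitioning (sharding) from the slide above can be sketched roughly as follows. This is a minimal illustration only: the shard count, the `shard_for`, `put` and `get` helpers, and the choice of SHA-256 as the routing hash are all assumptions made for the example, not something the slides specify.

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of servers
# Each dict stands in for one database server holding a slice of the rows.
shards = [{} for _ in range(SHARD_COUNT)]


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Pick a shard from a stable hash of the key (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def put(key: str, row: dict) -> None:
    """Store the row only on the shard that owns the key."""
    shards[shard_for(key)][key] = row


def get(key: str):
    """Read from the owning shard; the other shards are never touched."""
    return shards[shard_for(key)].get(key)


put("sergejusb", {"name": "Sergejus"})
print(get("sergejusb"))
```

A simple modulo scheme like this reshuffles most keys when the shard count changes, which is one reason real systems often use other placement strategies.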
    5. 5. <ul><li>Limited SQL availability</li></ul>Master / slave configuration<br />NoSQL – Why?<br />
    6. 6. <ul><li>SQL limitations for storing huge amounts of data</li></ul>Key / value / type columns<br />NoSQL – Why?<br />
    7. 7. <ul><li>Limited SQL speed of read/write operations</li></ul>Multiple read replicas<br />NoSQL – Why?<br />
    8. 8. <ul><li>2009, Eric Evans
    9. 9. NoSQL – open source distributed databases, not relational SQL databases
    10. 10. NoSQL – not only SQL
    11. 11. NoSQL->Big Data</li></ul>NoSQL History<br />
    12. 12. <ul><li>The ability to horizontally scale simple-operation throughput over many servers</li></ul>NoSQL Characteristics (scalability)<br />
    13. 13. <ul><li>A “weaker” concurrency model than the ACID transactions in most SQL systems</li></ul>NoSQL Characteristics (BASE)<br />
    14. 14. <ul><li>Efficient use of distributed indexes and RAM for data storage</li></ul>NoSQL Characteristics (distributed)<br />
    15. 15. <ul><li>The ability to dynamically define new attributes or data schema</li></ul>NoSQL Characteristics (schema-less)<br />
    16. 16. <ul><li>Atomicity – all or nothing
    17. 17. Consistency – state integrity
    18. 18. Isolation – no reads of uncommitted data
    19. 19. Durability – recover committed transactions</li></ul>ACID (transactions)<br />
    20. 20. <ul><li>2000, Eric Brewer</li></ul>It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:<br /><ul><li>Consistency
    21. 21. Availability
    22. 22. Partition tolerance</li></ul>CAP Theorem<br />
    23. 23. <ul><li>Basically Available – partial system failures are OK
    24. 24. Soft state – inconsistency is OK
    25. 25. Eventual consistency – stale data is OK </li></ul>BASE (eventual consistency)<br />
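The "stale data is OK" idea behind BASE can be sketched with two in-memory replicas that converge only after a later synchronization step. The `Replica` class, the explicit integer versions, and the last-write-wins `sync` function are assumptions chosen for illustration; real systems use a variety of conflict-resolution strategies.

```python
class Replica:
    """One copy of the data; stores key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Keep the write only if it is newer than what this replica has.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries in both directions; the higher version wins."""
    for source, target in ((a, b), (b, a)):
        for key, (version, value) in list(source.data.items()):
            target.write(key, value, version)


first, second = Replica(), Replica()
first.write("status", "online", version=1)
stale = second.read("status")   # the second replica has not seen the write yet
sync(first, second)
fresh = second.read("status")   # after syncing, both replicas agree
print(stale, fresh)
```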
    26. 26.
    27. 27. NoSQL Databases<br />
    28. 28. <ul><li>Key / value store
    29. 29. Document database
    30. 30. Graph database
    31. 31. Columnar database</li></ul>NoSQL Categories<br />
    32. 32. <ul><li><key, value> or Tuple<key, v1, ..., vn>
    33. 33. Simple operations</li></ul>Get<br />Put<br />Delete<br />Key / value store<br />Key<br />Value<br />Byte[]<br />Byte[]<br />
    34. 34. Key / value store<br />Key<br />Value<br />“current_date”<br />2011.01.16<br />“sergejusb”<br />Binary Object<br />“sergejusb”<br />JSON Object<br />
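The Get / Put / Delete interface over byte-array keys and values can be sketched as below. The `KeyValueStore` class is a hypothetical in-memory stand-in, not the API of any product listed in these slides; the sample keys reuse the ones shown on the slide, and the JSON payload is made up.

```python
import json
from typing import Dict, Optional


class KeyValueStore:
    """Minimal in-memory key / value store: opaque bytes in, opaque bytes out."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        """Return True if the key existed and was removed."""
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put(b"current_date", b"2011.01.16")
# The store does not interpret values, so a JSON object is just serialized bytes.
store.put(b"sergejusb", json.dumps({"blog": "sergejus.blogas.lt"}).encode("utf-8"))
profile = json.loads(store.get(b"sergejusb").decode("utf-8"))
print(profile["blog"])
```

Because the store never looks inside a value, any querying beyond lookup by key has to be done by the client.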
    35. 35. <ul><li>Dynamo*
    36. 36. Membase
    37. 37. Voldemort
    38. 38. Redis
    39. 39. Azure Table Storage
    40. 40. Riak</li></ul>Key / value store<br />
    41. 41. Name: Dynamo<br />Created: 2007, Amazon (proprietary)<br />Implementation: ?<br />Distributed: Yes<br />Replication: Multiple Servers<br />CAP: AP<br />API: ?<br />Key / value store<br />
    42. 42. Name: Membase<br />Created: 2010, sponsored by Zynga<br />Implementation: C / C++ / Erlang<br />Distributed: Yes<br />Replication: Multiple Servers<br />CAP: CP<br />API: Memcached API, JSON<br />Key / value store<br />
    43. 43. Name: Voldemort<br />Created: 2008, LinkedIn<br />Implementation: Java<br />Distributed: Yes<br />Replication: Multiple Servers<br />CAP: AP<br />API: Java<br />Key / value store<br />
    44. 44. Name: Redis<br />Created: 2009, sponsored by VMware<br />Implementation: C<br />Distributed: No<br />Replication: Master / Slave<br />CAP: CP<br />API: Various Languages<br />Key / value store<br />
    45. 45. Name: Azure Table Storage<br />Created: 2008, Microsoft<br />Implementation: ?<br />Distributed: Yes<br />Replication: Multiple Servers (DFS)<br />CAP: CP<br />API: .NET API, JSON<br />Key / value store<br />
    46. 46. Name: Riak<br />Created: 2008, Basho (from Akamai)<br />Implementation: Erlang<br />Distributed: Yes<br />Replication: Multiple Servers<br />CAP: AP<br />API: JSON<br />Key / value store<br />
    47. 47. <ul><li>Document == complex object</li></ul>XML<br />YAML<br />JSON / BSON<br /><ul><li>Support for secondary indexes
    48. 48. Schema can be defined at runtime
    49. 49. Optional support for simple querying using Map / Reduce</li></ul>Document database<br />
    50. 50. <ul><li>MongoDB
    51. 51. CouchDB
    52. 52. RavenDB</li></ul>Document database<br />
    53. 53. Name: MongoDB<br />Created: 2008, 10gen<br />Implementation: C++<br />Distributed: Yes via Shards<br />Replication: Master / Slave<br />CAP: CP<br />API: BSON<br />Document database<br />
    54. 54. Name: CouchDB<br />Created: 2005<br />Implementation: Erlang<br />Distributed: Sort of<br />Replication: Master / Master<br />CAP: AP<br />API: JSON<br />Document database<br />
    55. 55. Name: RavenDB<br />Created: 2010, Ayende Rahien<br />Implementation: C#<br />Distributed: Yes via Shards<br />Replication: Master / Master<br />CAP: AP<br />API: .NET API, JSON<br />Document database<br />
    56. 56. <ul><li>Graph == network
    57. 57. Basic constructs</li></ul>Node<br />Edge<br />Properties<br />Graph database<br />sergejus.blogas.lt<br />reads<br />authors<br />knows<br />sergejus<br />tdagys<br />knows<br />
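The node / edge / properties model can be sketched with a small in-memory graph. The `Graph` class is hypothetical, and the direction of each edge (who knows, reads, or authors what) is an assumption, since the slide transcript lists only the labels and node names.

```python
class Graph:
    """Tiny property graph: nodes and edges both carry key / value properties."""

    def __init__(self):
        self.nodes = {}   # node id -> properties dict
        self.edges = {}   # source id -> list of (label, target id, properties)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target, **properties):
        self.edges.setdefault(source, []).append((label, target, properties))

    def neighbors(self, source, label):
        """Follow outgoing edges with the given label."""
        return [target for (edge_label, target, _) in self.edges.get(source, [])
                if edge_label == label]


graph = Graph()
graph.add_node("sergejus", kind="person")
graph.add_node("tdagys", kind="person")
graph.add_node("sergejus.blogas.lt", kind="blog")
# Edge directions below are assumed for the example.
graph.add_edge("sergejus", "authors", "sergejus.blogas.lt")
graph.add_edge("tdagys", "reads", "sergejus.blogas.lt")
graph.add_edge("sergejus", "knows", "tdagys")
graph.add_edge("tdagys", "knows", "sergejus")
print(graph.neighbors("tdagys", "reads"))
```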
    58. 58. <ul><li>FlockDB
    59. 59. Neo4j</li></ul>Graph database<br />
    60. 60. Name: FlockDB<br />Created: 2010, Twitter<br />Implementation: Scala<br />Distributed: Yes<br />Replication: Multiple Servers<br />CAP: AP<br />API: Thrift, Ruby<br />Graph database<br />
    61. 61. Name: Neo4j<br />Created: 2003, Neo Technology<br />Implementation: Java<br />Distributed: No<br />Replication: Master / Slave<br />CAP: CP<br />API: JSON, Various Languages<br />Graph database<br />
    62. 62. <ul><li>For HUGE amounts of data
    63. 63. Columns are added at runtime
    64. 64. Great scalability </li></ul>Horizontal <br />Vertical<br />Columnar database<br />
    65. 65. <ul><li>Unusual data model</li></ul>Key Space == Database<br />Column Family == Table<br />Columns and Super Columns<br />Super Column == array of Columns<br />Column == Tuple<Key, Value, Timestamp, TTL><br />Columnar database<br />
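The mapping above (Key Space == Database, Column Family == Table, Column == Tuple<Key, Value, Timestamp, TTL>) can be sketched with nested dictionaries. This is only an illustration of the data shape: the `Column` type, the `make_column` and `insert` helpers, and the "Users" column family are hypothetical names, not the API of any columnar database.

```python
import time
from typing import NamedTuple, Optional


class Column(NamedTuple):
    """Column == Tuple<Key, Value, Timestamp, TTL>, as on the slide."""
    key: str
    value: bytes
    timestamp: int
    ttl: Optional[int] = None   # seconds to live; None means no expiry


def make_column(key: str, value: bytes, ttl: Optional[int] = None) -> Column:
    # Microsecond timestamp, so a later write can be told apart from an earlier one.
    return Column(key, value, int(time.time() * 1_000_000), ttl)


# Key space -> column family -> row key -> column name -> Column
keyspace = {"Users": {}}


def insert(column_family: str, row_key: str, column: Column) -> None:
    """Rows are created on first write, and each row may have different columns."""
    keyspace[column_family].setdefault(row_key, {})[column.key] = column


insert("Users", "sergejusb", make_column("blog", b"sergejus.blogas.lt"))
insert("Users", "sergejusb", make_column("session", b"abc123", ttl=3600))
insert("Users", "tdagys", make_column("blog", b"example"))
print(sorted(keyspace["Users"]["sergejusb"]))
```

A super column would add one more level of nesting: a named group whose value is itself a dictionary of columns.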
    66. 66. Columnar database<br /><ul><li>Simple Column</li></ul>Columnar database<br /><ul><li>Super Column</li></ul><ul><li>BigTable*
    67. 67. Cassandra
    68. 68. HBase
    69. 69. Hypertable</li></ul>Columnar database<br />
    70. 70. Name: BigTable<br />Created: 2006, Google<br />Implementation: C++<br />Distributed: Yes<br />Replication: Multiple Servers (GFS)<br />CAP: CP<br />API: C++<br />Columnar database<br />
    71. 71. Name: Cassandra<br />Created: 2008, Facebook<br />Implementation: Java<br />Distributed: Yes<br />Replication: Multiple Servers<br />CAP: AP<br />API: Thrift, Avro<br />Columnar database<br />
    72. 72. Name: HBase<br />Created: 2007, Powerset<br />Implementation: Java<br />Distributed: Yes<br />Replication: Multiple Servers (HDFS)<br />CAP: CP<br />API: Thrift, Java, JSON<br />Columnar database<br />
    73. 73. Name: Hypertable<br />Created: 2007, Zvents<br />Implementation: C++<br />Distributed: Yes<br />Replication: Multiple Servers<br />CAP: CP<br />API: Thrift<br />Columnar database<br />
    74. 74. <ul><li>ORDER BY ?</li></ul>“Natural Key Order”<br />NoSQL Limitations<br />
    75. 75. <ul><li>GROUP BY ?</li></ul>Map / Reduce<br />NoSQL Limitations<br />
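How Map / Reduce can stand in for GROUP BY is sketched below on a single machine. The `map_reduce` helper and the sample records are assumptions for illustration; a real deployment would run the map and reduce phases across many servers.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Map each record to (key, value) pairs, group by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


# Sample records: (database name, category), taken from the categories in these slides.
databases = [
    ("Redis", "key/value"),
    ("Riak", "key/value"),
    ("MongoDB", "document"),
    ("Neo4j", "graph"),
]

# Roughly: SELECT category, COUNT(*) FROM databases GROUP BY category
counts = map_reduce(
    databases,
    mapper=lambda record: [(record[1], 1)],
    reducer=lambda key, values: sum(values),
)
print(counts)
```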
    76. 76. <ul><li>JOIN ?</li></ul>Multiple Map / Reduce<br />NoSQL Limitations<br />
    77. 77. <ul><li>SELECT * ?</li></ul>Multi-Machine Map / Reduce<br />NoSQL Limitations<br />
    78. 78. <ul><li>Maturity
    79. 79. Tooling
    80. 80. Specificity</li></ul>NoSQL Limitations<br />
    81. 81. <ul><li>Choose the right tool for the task
    82. 82. You can use BOTH</li></ul>SQL vs. NoSQL<br />
    83. 83. Q & A<br />