Web Scale with NoSQL

Introduction to NoSQL,
RDBMS Scalability,
Why NoSQL,
Categories of NoSQL,
SQL vs NoSQL

Published in: Technology
  • Atomicity. All of the operations in the transaction will complete, or none will.
    Consistency. The database will be in a consistent state when the transaction begins and ends.
    Isolation. The transaction will behave as if it is the only operation being performed upon the database.
    Durability. Upon completion of the transaction, the operation will not be reversed.
  • Consistency. The client perceives that a set of operations has occurred all at once.
    Availability. Every operation must terminate in an intended response.
    Partition tolerance. Operations will complete, even if individual components are unavailable.
    http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
  • Basically Available. Supporting partial failures without total system failure.
    Soft state. The state can be inconsistent for a given period of time.
    Eventual consistency. After some time all replicas will have consistent data. For a given accepted update and a given replica, eventually either the update reaches the replica or the replica retires from service.
  • Web Scale with NoSQL

    2. Web Scale with NoSQL
       Sergejus Barinovas (@sergejusb)
       http://sergejus.blogas.lt
    3. Who Am I?
       Architect at
       Running NoSQL servers in production
       Blogger (http://sergejus.blogas.lt, @sergejusb)
       Community member (http://dotnetgroup.lt)
       Contact me via sergejus.barinovas@gmail.com
    4. Powered by RDBMS
       Used everywhere…
       …even where it shouldn’t
       Used for 30+ years!
    5. Back to the 1980s…
    6. Data boom
    7. in numbers
       600 000 000 users
       30 000 servers
       20+ TB raw data per day
       >20 PB stored data
    8. You really think they use RDBMS?
    9. RDBMS Scaling Example
    10. Simple usage
        Customers (master): Reads / Writes
    11. Scale reads
        Customers (master): Writes
        slave, slave: Reads
    12. Scale writes
        Customers [A-M] (master): Reads / Writes [A-M]
        Customers [N-Z] (master): Reads / Writes [N-Z]
    13. Scale reads / writes
        Customers [A-M] (master): Writes [A-M]
        slave, slave: Reads [A-M]
        Customers [N-Z] (master): Writes [N-Z]
        slave, slave: Reads [N-Z]
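
The scaling steps on the slides above (reads go to slaves, writes go to a master, customers are split by name range) can be sketched as a small routing function. This is a minimal illustration, not the author's setup: the host names, the `Shard` class and the round-robin choice of slave are all assumptions.

```python
import itertools


class Shard:
    """One master plus its read replicas for a range of customer names (hypothetical)."""

    def __init__(self, first, last, master, slaves):
        self.first, self.last = first, last
        self.master = master
        # Assumption: reads are spread over the slaves round-robin.
        self._slaves = itertools.cycle(slaves)

    def owns(self, customer_name):
        return self.first <= customer_name[0].upper() <= self.last

    def next_slave(self):
        return next(self._slaves)


# Hypothetical host names for the two ranges shown on the slides.
SHARDS = [
    Shard("A", "M", "master-am", ["slave-am-1", "slave-am-2"]),
    Shard("N", "Z", "master-nz", ["slave-nz-1", "slave-nz-2"]),
]


def route(customer_name, operation):
    """Return the server that should handle a 'read' or 'write' for this customer."""
    if not customer_name:
        raise ValueError("customer name must not be empty")
    for shard in SHARDS:
        if shard.owns(customer_name):
            return shard.master if operation == "write" else shard.next_slave()
    raise ValueError(f"no shard owns {customer_name!r}")
```

Every query now depends on application-level routing, and a failed master takes a whole name range offline for writes, which is the fragility the next slide alludes to.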
    14. Pray your system won’t fail
    15. Enter NoSQL
    16. Why NoSQL
        Limited SQL scalability
          Sharding and vertical partitioning
        Limited SQL availability
          Master / slave configuration
        Limited SQL speed of read operations
          Multiple read replicas
        SQL limitations for huge amounts of data
          Key / value / type columns
    17. NoSQL history
        2009, Eric Evans, no:sql(est)
        NoSQL – open source distributed databases, not relational SQL databases
        NoSQL – not only SQL
        NoSQL -> Big Data
    18. NoSQL characteristics (1/2)
        Scalability
          The ability to horizontally scale simple-operation throughput over many servers
        BASE
          A “weaker” concurrency model than the ACID transactions in most SQL systems
    19. NoSQL characteristics (2/2)
        Distributed
          Efficient use of distributed indexes and RAM for data storage
        Schema-less
          The ability to dynamically define new attributes or data schema
    20. ACID (transactions)
        Atomicity – all or nothing
        Consistency – state integrity
        Isolation – no reads of uncommitted data
        Durability – committed transactions can be recovered
    21. CAP theorem
        2000, Eric Brewer
        It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
          Consistency
          Availability
          Partition tolerance
    22. BASE (eventual consistency)
        Basically Available – partial system failures are OK
        Soft state – inconsistency is OK
        Eventual consistency – stale data is OK
    24. NoSQL Databases
    25. NoSQL categories
        Key / value store
        Document database
        Graph database
        Columnar database
    26. Key / value store
        <key, value> or Tuple<key, v1, ..., vn>
        Simple operations
          Get
          Put
          Delete
        Key: Byte[]
        Value: Byte[]
    27. Key / value store
        Key: “current_date”, Value: 2011.04.04
        Key: “sergejusb”, Value: Binary Object
        Key: “sergejusb”, Value: JSON Object
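
The Get / Put / Delete interface over byte-array keys and values can be sketched with an in-memory dictionary. This is only an illustration of the interface: the `KeyValueStore` class and the profile fields are hypothetical, and a real store adds persistence, replication and partitioning.

```python
import json


class KeyValueStore:
    """In-memory stand-in for a key / value store: opaque bytes in, opaque bytes out."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        # Returns None when the key is missing.
        return self._data.get(key)

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)


store = KeyValueStore()
store.put(b"current_date", b"2011.04.04")

# The store does not interpret values; the application serialises a JSON object itself.
profile = {"name": "Sergejus", "blog": "http://sergejus.blogas.lt"}
store.put(b"sergejusb", json.dumps(profile).encode("utf-8"))

loaded = json.loads(store.get(b"sergejusb").decode("utf-8"))
```

Because values are opaque, the store cannot query by anything except the key; that restriction is what keeps the operations simple enough to scale horizontally.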
    28. Key / value stores
        Redis: (+) messaging, (-) no shards
        Voldemort
        Membase: (+) memcache interface
        Riak
    29. Document database
        Document == complex object
          XML
          YAML
          JSON / BSON
        Support for secondary indexes
        Schema can be defined at runtime
        Optional support for simple querying using Map / Reduce
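
A rough sketch of what "schema defined at runtime" and "secondary indexes" mean for a document database. The `DocumentCollection` class, its method names and the sample posts are assumptions made for illustration; they do not mirror the API of MongoDB, CouchDB or any other product.

```python
from collections import defaultdict


class DocumentCollection:
    """Toy document collection: JSON-like dicts plus optional secondary indexes."""

    def __init__(self):
        self._docs = {}     # document id -> document
        self._indexes = {}  # field name -> {field value -> set of document ids}

    def create_index(self, field):
        index = defaultdict(set)
        for doc_id, doc in self._docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self._indexes[field] = index

    def insert(self, doc_id, doc):
        # Simplification: documents are inserted once and never updated.
        if doc_id in self._docs:
            raise KeyError(f"duplicate document id {doc_id!r}")
        self._docs[doc_id] = doc
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field, value):
        if field in self._indexes:
            # Indexed lookup: no scan over all documents.
            ids = self._indexes[field].get(value, set())
            return [self._docs[i] for i in sorted(ids)]
        # No index on this field: fall back to a full scan.
        return [doc for doc in self._docs.values() if doc.get(field) == value]


posts = DocumentCollection()
posts.create_index("author")
posts.insert("p1", {"author": "sergejusb", "title": "Web Scale with NoSQL"})
# A new attribute ("tags") appears without any schema change.
posts.insert("p2", {"author": "sergejusb", "title": "Hello", "tags": ["nosql"]})
posts.insert("p3", {"author": "tdagys", "title": "Graphs"})
```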
    30. Document databases
        MongoDB: (+) shards
        CouchDB: (+) master / master replication
    31. Graph database
        Graph == network
        Basic constructs
          Node
          Edge
          Properties
        Example nodes: sergejus.blogas.lt, sergejus, tdagys
        Example edges: reads, authors, knows, knows
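
The three basic constructs (node, edge, properties) can be sketched with the names from the slide. The edge directions below are assumptions, since the original diagram is not preserved in the transcript, and the `Graph` class and the `type` property are hypothetical.

```python
class Graph:
    """Toy property graph: named nodes and labelled, directed edges."""

    def __init__(self):
        self.nodes = {}  # node name -> properties
        self.edges = []  # (source, label, target, properties)

    def add_node(self, name, **properties):
        self.nodes[name] = properties

    def add_edge(self, source, label, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of an edge must be existing nodes")
        self.edges.append((source, label, target, properties))

    def neighbours(self, source, label):
        """Follow all outgoing edges with the given label."""
        return [t for s, l, t, _ in self.edges if s == source and l == label]


graph = Graph()
graph.add_node("sergejus", type="person")
graph.add_node("tdagys", type="person")
graph.add_node("sergejus.blogas.lt", type="blog")

# Assumed directions: the author writes the blog, the other person reads it.
graph.add_edge("sergejus", "authors", "sergejus.blogas.lt")
graph.add_edge("tdagys", "reads", "sergejus.blogas.lt")
graph.add_edge("sergejus", "knows", "tdagys")
graph.add_edge("tdagys", "knows", "sergejus")
```

A query such as "which blogs do the people I know read?" becomes two hops along edges instead of a chain of JOINs.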
    32. Graph databases
        Neo4j: (-) paid version required for scaling
        FlockDB: (+) fast, (-) limited functionality
    33. Columnar database
        For HUGE amounts of data
        Columns are added at runtime
        Great scalability
          Horizontal
          Vertical
    34. Columnar database
        Unusual data model
        Key Space -> Database
        Column Family -> Table
        Columns and Super Columns
          Super Column -> array of Columns
          Column -> Tuple<Key, Value, Timestamp, TTL>
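
The data model on the slide above (key space, column family, column as a `<Key, Value, Timestamp, TTL>` tuple) can be sketched with nested dictionaries. The helper names are hypothetical, and the "newest timestamp wins" rule in `put` is an assumption modelled on how such stores commonly resolve conflicting writes; the slide does not state it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Column:
    key: str
    value: bytes
    timestamp: float
    ttl: Optional[float] = None  # seconds to live; None means the column never expires

    def is_expired(self, now: float) -> bool:
        return self.ttl is not None and now >= self.timestamp + self.ttl


Row = Dict[str, Column]        # column key -> column
ColumnFamily = Dict[str, Row]  # row key -> row (roughly a table)
Keyspace = Dict[str, ColumnFamily]  # column family name -> family (roughly a database)


def put(keyspace: Keyspace, family: str, row_key: str, column: Column) -> None:
    row = keyspace.setdefault(family, {}).setdefault(row_key, {})
    current = row.get(column.key)
    # Assumption: the write with the newest timestamp wins.
    if current is None or column.timestamp >= current.timestamp:
        row[column.key] = column


def get(keyspace: Keyspace, family: str, row_key: str, column_key: str, now: float):
    column = keyspace.get(family, {}).get(row_key, {}).get(column_key)
    if column is None or column.is_expired(now):
        return None
    return column.value


ks: Keyspace = {}
put(ks, "Users", "sergejusb", Column("email", b"sergejus.barinovas@gmail.com", timestamp=1.0))
put(ks, "Users", "sergejusb", Column("session", b"abc", timestamp=1.0, ttl=60))
# A new column is added at runtime; no other row needs to have it.
put(ks, "Users", "sergejusb", Column("twitter", b"@sergejusb", timestamp=2.0))
```

In this sketch a super column would simply be one more level of nesting: a named dictionary of columns stored inside the row.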
    35. Columnar database
        Simple column
    36. Columnar database
        Simple column
    37. Columnar database
        Cassandra: (+) easily scalable
        HBase: (+) consistent, (+) part of Hadoop
        Hypertable
    38. NoSQL is Cool! But…
    40. NoSQL limitations
        ORDER BY ?
          Natural key order
        GROUP BY ?
          Map / Reduce*
        JOIN ?
          Multiple Map / Reduce*
        SELECT * ?
          Multi-machine Map / Reduce*
        *if possible
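
To illustrate "GROUP BY? Map / Reduce", here is a single-process sketch of the three phases. The function names and the sample orders are made up for the example; a real store would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record; each call may emit several (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group emitted values by key, which is the part that replaces GROUP BY."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    return {key: reducer(key, values) for key, values in groups.items()}


def map_reduce(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# Rough equivalent of: SELECT customer, SUM(total) FROM orders GROUP BY customer
orders = [
    {"customer": "sergejus", "total": 10},
    {"customer": "tdagys", "total": 5},
    {"customer": "sergejus", "total": 7},
]


def by_customer(order):
    yield order["customer"], order["total"]


def sum_totals(customer, totals):
    return sum(totals)


totals = map_reduce(orders, by_customer, sum_totals)
```

A JOIN would need a second pass of this kind over another data set, which is why the slide lists it as "Multiple Map / Reduce".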
    41. NoSQL limitations
        Maturity
        Tooling
        Specificity
    42. SQL vs. NoSQL
        Choose the right tool for the task
        You can use BOTH
    43. Thank you!
        Sergejus Barinovas (@sergejusb)
        sergejus.barinovas@gmail.com
        http://sergejus.blogas.lt