Introduction and disclosure: I work for Basho and have been working on a Dynamo clone for the last couple of years.
1. NoSQL Databases Jon Meredith [email_address]
2. What isn't NoSQL? <ul><li>NOT a standard.
3. NOT a product.
4. NOT a single technology. </li></ul>
5. Well, what is it? <ul><li>It's a buzzword. </li><li>A banner for non-relational databases to organize under.
6. Mostly created in response to scaling and reliability problems.
7. Huge differences between 'NoSQL' systems – but have elements in common. </li></ul>
8. Where did it come from? <ul><li>They've been around for a while </li><ul><li>Local key/value stores
9. Object databases
10. Graph databases
11. XML databases </li></ul><li>New problems are emerging </li><ul><li>Internet search
13. Social networking </li></ul></ul>
14. Where did it come from? <ul><li>Some efforts came from scaling the web...
15. Several papers published </li><ul><li>2006 – Google BigTable
16. 2007 – Dynamo Paper </li></ul><li>In 2008 - explosion of data storage projects
17. All shambling under the NoSQL banner. </li></ul>
18. Really, why not use an RDBMS? <ul><li>I need to perform arbitrary queries
19. My application needs transactions
20. Data needs to be nicely normalized
21. I have replication for scalability/reliability </li></ul>
22. Data Mapping Woes <ul><li>Relational databases divide data into tables made up of columns.
23. Programmers use complex nested data structures </li><ul><li>Hashes
26. Things of things </li></ul><li>Have to map between the two </li></ul>
27. Data Mapping Woes (2) <ul><li>Data in systems evolves over time … which means changes to the schema.
28. Upgrade/rollback scripts have to operate on the whole database – could be millions of rows.
29. Doing phased rollouts is hard … the application needs to do work </li></ul>
30. Alternative! <ul><li>Let the application do it
31. Use convenient language features </li><ul><li>PHP serialize/unserialize </li></ul><li>… or use standards for mixed platforms </li><ul><li>JSON very popular and well supported
32. Google's protocol buffers
33. … even XML </li></ul><li>Design for forward compatibility </li><ul><li>Preserve unknown fields
34. Version objects </li></ul></ul>
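The "let the application do it" approach above can be sketched roughly as follows. This is a minimal illustration, assuming JSON as the storage format; the record layout, the `version` field name, and the v1-to-v2 migration are hypothetical and not taken from the talk.

```python
import json

CURRENT_VERSION = 2  # hypothetical schema version used by this sketch


def load_profile(raw: str) -> dict:
    """Decode a stored record, upgrading older versions on read."""
    record = json.loads(raw)
    version = record.get("version", 1)
    if version < 2:
        # Hypothetical migration: v1 stored one "name" string,
        # v2 stores "first_name" and "last_name".
        first, _, last = record.pop("name", "").partition(" ")
        record["first_name"] = first
        record["last_name"] = last
    record["version"] = CURRENT_VERSION
    return record


def save_profile(record: dict) -> str:
    """Encode the whole dict, so fields this code does not know about survive."""
    return json.dumps(record, sort_keys=True)


# "theme" is a field this code never touches; it is preserved on rewrite.
old = '{"version": 1, "name": "Ada Lovelace", "theme": "dark"}'
profile = load_profile(old)
stored = save_profile(profile)
```

Because records are upgraded one at a time as they are read, no script has to rewrite the whole database at once, which is the point the slides make about phased rollouts.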
35. Scalability and Availability <ul><li>Scalability </li><ul><li>How many requests you can process </li></ul><li>Availability </li><ul><li>How does your service degrade as things break. </li></ul><li>RDBMS solutions - replication and sharding </li></ul>
36. Scaling RDBMS - Replication <ul><li>Master-Slave replication is easiest
37. Every change on the master happens on the slave.
38. Slaves are read-only. Does not scale INSERT, UPDATE, DELETE queries.
39. Application responsible for distributing queries to correct server. </li></ul>
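As a rough sketch of what "application responsible for distributing queries" can mean in practice: the router below sends writes to the master and spreads reads across the slaves. The class name, the server names, and the verb-sniffing approach are all assumptions made for illustration; real applications usually route at the connection or ORM layer.

```python
import itertools

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}


class ReplicatedRouter:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin

    def server_for(self, sql: str) -> str:
        # Assumes a non-empty statement whose first word is the SQL verb.
        verb = sql.split(None, 1)[0].upper()
        if verb in WRITE_VERBS:
            return self.master
        return next(self._slaves)


router = ReplicatedRouter("db-master", ["db-slave-1", "db-slave-2"])
write_target = router.server_for("UPDATE users SET name = 'x' WHERE id = 1")
read_target = router.server_for("SELECT * FROM users WHERE id = 1")
```

Adding slaves increases read capacity only; every write still lands on the single master.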
40. Scaling RDBMS - Replication <ul><li>Multi-master ring replication </li><ul><li>Can update any master
41. Updates travel around the ring
42. What happens when it fails? </li><ul><li>Reconfigure the ring </li></ul><li>What happens on return? </li><ul><li>Synchronize the master
43. Add back in to the ring </li></ul></ul></ul>
44. Replication <ul><li>Replication is usually asynchronous for performance – you don't want to wait for the slowest slave on each update.
45. Replication takes time – there is time lag between the first and last server to see an update.
46. You may not read your writes – not getting aCid properties any more. </li></ul>
47. Scaling RDBMS – Sharding <ul><li>Do application level splitting of data </li><ul><li>Split large table into N smaller tables
48. Use Id modulo N to find the right table </li></ul><li>Tables could be spread across multiple database servers </li><ul><li>But the application needs to know where to query </li></ul></ul>
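The "Id modulo N" scheme above could look roughly like this. The table naming pattern, the shard count, and the server list are hypothetical values chosen for the sketch.

```python
from typing import Tuple

N_SHARDS = 4                               # hypothetical number of smaller tables
SERVERS = ["db0.example", "db1.example"]   # hypothetical database servers


def shard_for(user_id: int) -> Tuple[str, str]:
    """Return the (server, table) pair that holds the row for user_id."""
    shard = user_id % N_SHARDS
    table = f"users_{shard}"
    # Spread the N tables over the available servers.
    server = SERVERS[shard % len(SERVERS)]
    return server, table


server, table = shard_for(10)  # 10 % 4 == 2, so table "users_2" on "db0.example"
```

Note that changing `N_SHARDS` changes where almost every Id maps, so resharding means moving most of the data; this is part of why the application "needs to know where to query".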
49. Availability <ul><li>If you want availability you need multiple servers – maybe even multiple sites.
50. In the real world you get network partitions </li><ul><li>Just because you can't see your other data center doesn't mean users can't. </li></ul><li>What should you do if you can't see the other data center? </li></ul>
51. Availability <ul><li>Degrade one site to read-only </li><ul><li>Defeats availability </li></ul><li>If you allow both sites to operate </li><ul><li>There's a chance two users could modify the same data.
52. The application needs to know how to resolve it </li></ul></ul>
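One way an application might resolve two divergent copies is shown below, using a shopping cart as the example. This is a sketch under assumptions: carts are modelled as item-to-quantity dicts, and the "keep the larger quantity" rule is just one possible policy, not something the talk prescribes.

```python
def merge_carts(site_a: dict, site_b: dict) -> dict:
    """Merge two versions of a cart so that no added item is lost."""
    merged = dict(site_a)
    for item, qty in site_b.items():
        # Keep every item seen at either site; on disagreement keep the larger count.
        merged[item] = max(merged.get(item, 0), qty)
    return merged


cart = merge_carts({"book": 1}, {"book": 2, "pen": 1})
```

The trade-off of a union-style merge is that an item removed at one site can reappear after the merge, since the other site's copy still contains it.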
53. The bottom line... <ul><li>Building systems that are </li><ul><li>...Scalable...
56. with an RDBMS requires large efforts by application developers and operational staff </li></ul></ul>
57. It's hard because... <ul><li>Significant work for developers. </li><ul><li>App needs to convert data to table/columns
58. App needs to know data location
59. App needs to handle failover
60. App needs to handle inconsistency </li></ul><li>Work for operational staff </li><ul><li>Fixing replication topologies and synchronizing servers is fiddly work. </li></ul></ul>
61. Last decade's bleeding edge is here <ul><li>Organizations with big problems started experimenting with alternatives
62. Developed internal systems during the mid 2000s </li><ul><li>Distributed by design
63. Different data models </li></ul><li>Published details in 2006/2007 </li></ul>
64. Amazon <ul><li>Huge e-commerce vendor.
65. Amazon cares about customer experience </li><ul><li>Availability
66. Latency at the 99th percentile </li></ul><li>Built as an SOA – pages built from hundreds of services.
67. Amazon runs multiple data centers. </li><ul><li>Hardware failure is their normal state
68. Network partitions common </li></ul></ul>
69. Amazon Requirements <ul><li>Shopping cart service must always be available
70. Customers should be able to view and add to their carts (in their words) </li><ul><li>If disks are failing
71. Network routes are flapping
72. Data centers are being destroyed by tornadoes </li></ul></ul>
73. Amazon Observations <ul><li>Many services just stored state. </li><ul><li>Access by primary key
74. No queries </li></ul><li>Examples </li><ul><li>Shopping carts
75. Best seller lists
76. Customer profiles </li></ul><li>Hard to scale out relational databases </li></ul>
77. Amazon Solution: Dynamo <ul><li>Primary key access only
78. Fault tolerant: Keeps N copies of the data
79. Designed for inconsistency
80. Totally decentralized – nodes 'gossip' state
81. Self-healing </li></ul>
82. Eventual Consistency 1 <ul><li>Brewer's CAP Theorem </li><ul><li>Consistency
83. Availability
84. Partition tolerance </li></ul><li>Pick two out of three!
85. Amazon chose A-P over C </li></ul>
86. Eventual Consistency 2 <ul><li>N copies of each value
87. Read operations (get) require 'R' nodes to respond
88. Write operations (put) require 'W' nodes to respond
89. If R + W > N, clients will read their own writes (if there are no failures)
90. NRW tunes the cluster – typically (3,2,2) </li></ul>
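The R + W > N rule can be checked by brute force: if every possible set of W nodes that acknowledged a write shares at least one node with every possible set of R nodes answering a read, a read always reaches a node holding the latest write. The function names below are made up for this sketch, and it ignores failures, as the slide does.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """The rule from the slide: read and write quorums overlap when R + W > N."""
    return r + w > n


def always_intersect(n: int, r: int, w: int) -> bool:
    """Brute-force check that every write set shares a node with every read set."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


typical = always_intersect(3, 2, 2)      # the (3,2,2) setting from the slide
fast_reads = always_intersect(3, 1, 2)   # R + W == N, so a read can miss the write
```

Lowering R or W trades this guarantee for lower latency, which is the sense in which NRW "tunes" the cluster.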
91. Eventual Consistency 3 <ul><li>Consequence of availability: Conflicts
92. Conflicts can come from </li><ul><li>Network partitions
93. Applications themselves – no transactions or locking </li></ul><li>Applications must handle conflicts
94. Dynamo minimizes conflicts with vector clocks </li></ul>
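A minimal sketch of the general vector clock technique follows, to show how a store can tell "one version supersedes the other" apart from a real conflict. It illustrates the idea only and is not Dynamo's actual implementation; clocks are modelled as plain dicts from node name to counter, and all function names are hypothetical.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify two versions as equal, ordered, or conflicting."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"
    if b_ge:
        return "b_newer"
    return "conflict"  # concurrent updates: the application must reconcile


base = increment({}, "x")          # first write handled by node x
left = increment(base, "y")        # update handled by node y
right = increment(base, "z")       # concurrent update handled by node z
ordered = compare(left, base)      # left supersedes base, nothing to resolve
diverged = compare(left, right)    # neither has seen the other: a conflict
```

Only the last case has to be passed back to the application; versions that are strictly ordered can be discarded by the store itself.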