0
NoSQL like there is NoTomorrow 
Khawaja 
Engineering Lead for NoSQL, AWS 
@swami_79 @ksshams
NASA JPL has visited every planet in the 
@ksshams 
solar system ... except Pluto
25 Gbps!
@ksshams
let’s start with a trilogy … 
@swami_79 @ksshams
episode 1 
once upon a time... 
(in 2000) 
@swami_79 @ksshams
a half mile away... 
(Seattle) 
@swami_79 @ksshams
amazon.com - a rapidly growing Internet based 
retail business relied on relational databases 
@swami_79 @ksshams
we had 1000s of independent services 
@swami_79 @ksshams
each service managed its own state in 
Relational Databases 
@swami_79 @ksshams
Relational Databases 
are pretty seductive 
@swami_79 @ksshams
first of all... SQL!! 
@swami_79 @ksshams
so it is easier to query.. 
@swami_79 @ksshams
easier to learn 
@swami_79 @ksshams
They are as versatile as a swiss army knife 
key-value access complex queries 
analytics transactions 
@swami_79 @ksshams
Relational Databases 
are very similar to 
Swiss Army Knives 
@swami_79 @ksshams
sometimes.. swiss army knifes.. 
can be more than what you bargained for 
@swami_79 @ksshams
partitioning 
easy 
re-partitioning 
HARD.. 
@swami_79 @ksshams
so we bought 
bigger boxes... 
@swami_79 @ksshams
benchmark 
new hardware 
migrate to new 
hardware 
Q4 was hard-work at Amazon 
repartition 
databases 
pray 
... 
@swami_7...
Relational Databases 
have availability challenges.. 
@swami_79 @ksshams
episode 2 
then.. (in 2005) 
@swami_79 @ksshams
amazon dynamo 
predecessor to dynamoDB 
replicated DHT with consistent hashing 
optimistic replication 
“sloppy quorum” 
a...
dynamo had many benefits 
• higher availability 
• we traded it off for consistency 
• incremental scalability 
• no more ...
but dynamo was not perfect... 
lacked strong consistency 
@swami_79 @ksshams
but dynamo was not perfect... 
scaling was easier, but... 
@swami_79 @ksshams
but dynamo was not perfect... 
steep learning curve 
@swami_79 @ksshams
but dynamo was not perfect... 
dynamo was a library ==>> not a service... 
@swami_79 @ksshams
episode 3 
then.. (in 2012) 
@swami_79 @ksshams
Managed NoSQL Database 
ADMIN 
DynamoDB 
Built for Scale 
Fast & Predictable Performance 
@swami_79 @ksshams
DynamoDB 
“Even though we have years of experience with large, 
complex NoSQL architectures, we are happy to be finally 
o...
DynamoDB Goals and Philosophies 
durability and availability 
scale is our problem 
easy to use 
scale in rps 
consistent ...
durability is key… 
@swami_79 @ksshams
availability is key… 
@swami_79 @ksshams
scale is our problem, not yours.. 
@swami_79 @ksshams
Fault Tolerant Design 
Infrastructure Fails - deal with it! 
Planning for failures is not easy 
How do you ensure your rec...
Byzantine General Problem 
@swami_79 @ksshams
A simple 2-way replication system of a traditional database… 
Writes 
@swami_79 @ksshams
S is dead, 
need to trigger 
new replica 
P is dead, need to 
promote myself 
P S 
@swami_79 @ksshams
Improved Replication: Quorum 
Writes 
Quorum: Successful write on a majority 
@swami_79 @ksshams
Easy? 
Writes from 
client X 
Reads and 
Writes from 
client Y 
Classic Split Brain Issue in Replicated systems leading to...
Building correct distributed systems is not straight forward.. 
Handle replica failures 
Handle partial failures of replic...
Trends in the World Of Databases 
@swami_79 @ksshams
A decade ago, it was all about the DBAs 
@swami_79 @ksshams
Last 5 years have been about self service. 
@swami_79 @ksshams
Today is about managed services. 
@swami_79 @ksshams
Plan for Success … Plan for Scale 
@swami_79 @ksshams
@swami_79 @ksshams
@swami_79 @ksshams
NoSQL like there is No Tomorrow
NoSQL like there is No Tomorrow
NoSQL like there is No Tomorrow
Upcoming SlideShare
Loading in...5
×

NoSQL like there is No Tomorrow

255

Published on

The AWS NoSQL team shares the design philosophy behind DynamoDB and lessons learned in building for massive scale.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
255
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
7
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • This story will be a little biased because Swami and I are from the NoSQL generation and we really don’t like relational databases
  • http://centerfordentalimplants.info/wp-content/uploads/2012/10/young-woman-holding-cheeks-in-surprise.jpg
  • more people already know it, etc
  • http://www.toolstop.co.uk/components/com_virtuemart/shop_image/product/399926c941f80748cd41a85a57f8c163.jpg
  • http://www.toolstop.co.uk/components/com_virtuemart/shop_image/product/399926c941f80748cd41a85a57f8c163.jpg Swami
  • http://www.swissknifeshop.com/media/catalog/product/cache/1/image/5e06319eda06f020e43594a9c230972d/W/G/WG16999.jpg
    They seriously don’t scale well
    and if you’re not careful and try to use all the features at once, you can get cut in the process
  • it was like playing minesweeper
  • RDBMs focus way too much on consistency - and compromise availability
    And seriously - are they really all that consistent? ever had slaves fall behind? or how about a recent bug we had where all slaves got updates, but the master didn’t take the update
    Like Swiss army knives - there is waaaay too much functionality in the knives. Let’s move it to the app (easier to debug and change).
    Developers often use transactions when they shouldn’t - very annoying - it is like bringing a swiss army knife through a TSA checkpoint. Just foolish
  • Khawaja
  • Swami:
  • K:
  • required action from developers
    Q4 scaling: While not as scary as phase 1, still was work
    Need benchmarking, patching, adding hardware, rebalancing clusters..
  • developers still had to learn to set it up
    if you are building a recommendation system, you want to be learning k-clustering and collaborative filtering algorithms rather than focusing on lamport’s papers
  • Swami:

    like s3 - and it required developers to be admins.

    this is the really key point - our developers wanted a service, not a product. If you look at mongodb, that is what they are doing today.
  • K
  • </K>
  • 2 dc writes
    on disk
    continuous replication
    we should brag about availability
  • multi-region service!
    Regional service
    spans multiple availability zones
    All data is continuously replicated to multiple AZ’s
  • plan for success, plan for scalability
  • Transcript of "NoSQL like there is No Tomorrow"

    1. 1. NoSQL like there is NoTomorrow Khawaja Engineering Lead for NoSQL, AWS @swami_79 @ksshams
    2. 2. NASA JPL has visited every planet in the @ksshams solar system ... except Pluto
    3. 3. 25 Gbps!
    4. 4. @ksshams
    5. 5. let’s start with a trilogy … @swami_79 @ksshams
    6. 6. episode 1 once upon a time... (in 2000) @swami_79 @ksshams
    7. 7. a half mile away... (Seattle) @swami_79 @ksshams
    8. 8. amazon.com - a rapidly growing Internet based retail business relied on relational databases @swami_79 @ksshams
    9. 9. we had 1000s of independent services @swami_79 @ksshams
    10. 10. each service managed its own state in Relational Databases @swami_79 @ksshams
    11. 11. Relational Databases are pretty seductive @swami_79 @ksshams
    12. 12. first of all... SQL!! @swami_79 @ksshams
    13. 13. so it is easier to query.. @swami_79 @ksshams
    14. 14. easier to learn @swami_79 @ksshams
    15. 15. They are as versatile as a swiss army knife key-value access complex queries analytics transactions @swami_79 @ksshams
    16. 16. Relational Databases are very similar to Swiss Army Knives @swami_79 @ksshams
    17. 17. sometimes.. swiss army knifes.. can be more than what you bargained for @swami_79 @ksshams
    18. 18. partitioning easy re-partitioning HARD.. @swami_79 @ksshams
    19. 19. so we bought bigger boxes... @swami_79 @ksshams
    20. 20. benchmark new hardware migrate to new hardware Q4 was hard-work at Amazon repartition databases pray ... @swami_79 @ksshams
    21. 21. Relational Databases have availability challenges.. @swami_79 @ksshams
    22. 22. episode 2 then.. (in 2005) @swami_79 @ksshams
    23. 23. amazon dynamo predecessor to dynamoDB replicated DHT with consistent hashing optimistic replication “sloppy quorum” anti-entropy mechanism object versioning specialist tool : •limited querying capabilities •simpler consistency @swami_79 @ksshams
    24. 24. dynamo had many benefits • higher availability • we traded it off for consistency • incremental scalability • no more repartitioning • no need to architect apps for peak • just add boxes • simpler querying model ==>> predictable performance @swami_79 @ksshams
    25. 25. but dynamo was not perfect... lacked strong consistency @swami_79 @ksshams
    26. 26. but dynamo was not perfect... scaling was easier, but... @swami_79 @ksshams
    27. 27. but dynamo was not perfect... steep learning curve @swami_79 @ksshams
    28. 28. but dynamo was not perfect... dynamo was a library ==>> not a service... @swami_79 @ksshams
    29. 29. episode 3 then.. (in 2012) @swami_79 @ksshams
    30. 30. Managed NoSQL Database ADMIN DynamoDB Built for Scale Fast & Predictable Performance @swami_79 @ksshams
    31. 31. DynamoDB “Even though we have years of experience with large, complex NoSQL architectures, we are happy to be finally out of the business of managing it ourselves.” - Don MacAskill, CEO @swami_79 @ksshams
    32. 32. DynamoDB Goals and Philosophies durability and availability scale is our problem easy to use scale in rps consistent and low latencies @swami_79 @ksshams
    33. 33. durability is key… @swami_79 @ksshams
    34. 34. availability is key… @swami_79 @ksshams
    35. 35. scale is our problem, not yours.. @swami_79 @ksshams
    36. 36. Fault Tolerant Design Infrastructure Fails - deal with it! Planning for failures is not easy How do you ensure your recovery strategies work correctly? @swami_79 @ksshams
    37. 37. Byzantine General Problem @swami_79 @ksshams
    38. 38. A simple 2-way replication system of a traditional database… Writes @swami_79 @ksshams
    39. 39. S is dead, need to trigger new replica P is dead, need to promote myself P S @swami_79 @ksshams
    40. 40. Improved Replication: Quorum Writes Quorum: Successful write on a majority @swami_79 @ksshams
    41. 41. Easy? Writes from client X Reads and Writes from client Y Classic Split Brain Issue in Replicated systems leading to lost writes! @swami_79 @ksshams
    42. 42. Building correct distributed systems is not straight forward.. Handle replica failures Handle partial failures of replicas Handle concurrent failures Ensure there isn’t a parallel quorum @swami_79 @ksshams
    43. 43. Trends in the World Of Databases @swami_79 @ksshams
    44. 44. A decade ago, it was all about the DBAs @swami_79 @ksshams
    45. 45. Last 5 years have been about self service. @swami_79 @ksshams
    46. 46. Today is about managed services. @swami_79 @ksshams
    47. 47. Plan for Success … Plan for Scale @swami_79 @ksshams
    48. 48. @swami_79 @ksshams
    49. 49. @swami_79 @ksshams
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×