NoSQL Revolution: Under the Covers of Distributed Systems at Scale (SPOT401) | AWS re:Invent 2013
The Dynamo paper started a revolution in distributed systems. The contributions from this paper are still impacting the design and practices of some of the world's largest distributed systems, including those at Amazon and beyond. Building distributed systems is hard, but our goal in this session is to simplify the complexity of this topic to empower the hacker in you! Have you been bitten by the eventual consistency bug lately? We show you how to tame eventual consistency and make it a great scaling asset. As you scale up, you must be ready to deal with node, rack, and data center failure. We share insights on how to limit the blast radius of the individual components of your system, battle-tested techniques for simulating failures (network partitions, data center failure), and how we used core distributed systems fundamentals to build highly scalable, performant, durable, and resilient systems. Come watch us uncover the secret sauce behind Amazon DynamoDB, Amazon SQS, Amazon SNS, and the fundamental tenets that define them as Internet-scale services. To turn this session into a hacker's dream, we go over design and implementation practices you can follow to build an application with virtually limitless scalability on AWS within an hour. We even share insights and secret tips on how to make the most out of one of the services released during the morning keynote.

Published in: Technology


  • 1. SPOT 401 - Leading the NoSQL Revolution: under the covers of Distributed Systems @ scale @swami_79 @ksshams
  • 2. what are we covering? the evolution of large-scale distributed systems at Amazon from the '90s to today, and the lessons we learned and insights you can employ in your own distributed systems
  • 3. let's start with a story about a little company called
  • 4. episode 1: once upon a time... (in 2000)
  • 5. a thousand miles away... (Seattle)
  • 6. - a rapidly growing Internet-based retail business relied on relational databases
  • 7. we had 1000s of independent services
  • 8. each service managed its own state in RDBMSs
  • 9. RDBMSs are actually kind of cool
  • 10. first of all... SQL!!
  • 11. so it is easier to query..
  • 12. easier to learn
  • 13. they are as versatile as a Swiss Army knife: complex queries, key-value access, analytics, transactions
  • 14. RDBMSs are *very* similar to Swiss Army knives
  • 15. but sometimes.. Swiss Army knives.. can be more than what you bargained for
  • 16. partitioning easy.. repartitioning HARD
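Why is repartitioning so much harder than partitioning? A small sketch makes the slide's point (this is our illustration, not Amazon's code): with naive hash-mod placement, adding a single node remaps most keys, so growing a sharded RDBMS fleet means moving almost all of the data.

```python
# Illustrative: naive modulo partitioning forces a near-total reshuffle
# when the node count changes.
import hashlib

def partition(key: str, num_nodes: int) -> int:
    """Map a key to a node with simple hash-mod placement."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

keys = [f"customer-{i}" for i in range(10_000)]
before = {k: partition(k, 4) for k in keys}
after = {k: partition(k, 5) for k in keys}   # add one node: 4 -> 5
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys moved")   # roughly 80%
```

Roughly 1 - 1/5 of keys stay put, so about 80% must migrate. Consistent hashing (coming up in the Dynamo slides) was designed precisely to avoid this.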
  • 17. so we bought bigger boxes...
  • 18. Q4 was hard work at Amazon: benchmark new hardware, migrate to new hardware, repartition databases, pray ...
  • 19. RDBMSs have availability challenges..
  • 20. episode 2: then.. (in 2005)
  • 21. amazon dynamo, the predecessor to DynamoDB: a replicated DHT with consistent hashing, optimistic replication, a "sloppy quorum", an anti-entropy mechanism, and object versioning. a specialist tool: limited querying capabilities, simpler consistency
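The consistent-hashing ring at the heart of Dynamo can be sketched in a few lines (our illustration of the general technique, not Dynamo's implementation): each node owns points on a hash ring, and a key is stored on the first N distinct nodes found walking clockwise from the key's hash, so adding a node only moves the keys on the arcs it takes over.

```python
# Illustrative consistent-hashing ring with virtual nodes.
import bisect
import hashlib

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=8):
        # Virtual nodes smooth out load imbalance across physical nodes.
        self._points = sorted(
            (ring_hash(f"{n}#{v}"), n) for n in nodes for v in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def preference_list(self, key: str, n: int = 3):
        """First n distinct nodes clockwise from hash(key): the replicas."""
        start = bisect.bisect(self._hashes, ring_hash(key))
        out = []
        for i in range(len(self._points)):
            node = self._points[(start + i) % len(self._points)][1]
            if node not in out:
                out.append(node)
            if len(out) == n:
                break
        return out

ring = Ring(["A", "B", "C", "D"])
replicas = ring.preference_list("cart:123")
print(replicas)   # three distinct nodes, e.g. the key's coordinator plus two successors
```

The returned "preference list" is what a sloppy quorum reads from and writes to; the node names and vnode count here are arbitrary.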
  • 22. dynamo had many benefits: higher availability (we traded it off for eventual consistency); incremental scalability (no more repartitioning, no need to architect apps for peak, just add boxes); a simpler querying model ==>> predictable performance
  • 23. but dynamo was not perfect... lacked strong consistency
  • 24. but dynamo was not perfect... scaling was easier, but...
  • 25. but dynamo was not perfect... steep learning curve
  • 26. but dynamo was not perfect... dynamo was a product ... ==>> not a service...
  • 27. episode 3: then.. (in 2012)
  • 28. DynamoDB • NoSQL database • fast & predictable performance • seamless scalability • easy administration. "Even though we have years of experience with large, complex NoSQL architectures, we are happy to be finally out of the business of managing it ourselves." - Don MacAskill, CEO
  • 29. services, services, services
  • 30. 's experience with services
  • 31. how do you create a successful service?
  • 32. with great services, comes great responsibility
  • 33. Architect Customer
  • 34. DynamoDB goals and philosophies: never compromise on durability; scale is our problem; easy to use; consistent and low latencies; scale in rps
  • 35. how to build these large scale services?
  • 36. don't compromise on durability…
  • 37. don't compromise on… availability
  • 38. plan for success, plan for scalability
  • 39.
  • 40. fault-tolerant design is key.. • everything fails all the time • planning for failures is not easy • how do you ensure your recovery strategies work correctly?
  • 41. Byzantine Generals Problem
  • 42. a simple 2-way replication system of a traditional database… writes go to a Primary, which replicates to a Standby
  • 43. under a partition, the Standby decides "P is dead, need to promote myself" while the Primary decides "S is dead, need to trigger a new replica" (P, P', S)
  • 44. improved replication: quorum. writes go to all replicas; a write is successful on a majority
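The quorum rule on this slide can be sketched directly (a minimal illustration of majority-ack replication, not DynamoDB's actual code): a write commits only when a majority of replicas acknowledge it, so any two majorities overlap in at least one replica and a single failed node cannot block or fork the system.

```python
# Illustrative majority-quorum write over three replicas.
class Replica:
    def __init__(self):
        self.value, self.version = None, 0
        self.alive = True

    def write(self, value, version):
        if not self.alive:
            raise ConnectionError("replica down")
        if version > self.version:
            self.value, self.version = value, version
        return True

def quorum_write(replicas, value, version):
    acks = 0
    for r in replicas:
        try:
            if r.write(value, version):
                acks += 1
        except ConnectionError:
            pass
    majority = len(replicas) // 2 + 1
    return acks >= majority   # commit only on a majority of acks

replicas = [Replica() for _ in range(3)]
replicas[2].alive = False              # one failed replica is tolerated
ok1 = quorum_write(replicas, "v1", 1)
print(ok1)                             # True: 2 of 3 replicas acked
replicas[1].alive = False
ok2 = quorum_write(replicas, "v2", 2)
print(ok2)                             # False: 1 ack is not a majority
```

Note the second write still landed on the one live replica; real systems must repair or roll back such partial writes, which is exactly where the next slide's trouble starts.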
  • 45. not so easy.. with new members in the group, replicas A, B, and C serve reads and writes from client B while replicas D, E, and F take writes from client A; each side asks "should I continue to serve reads? should I start a new quorum?" This is the classic split-brain issue in replicated systems, leading to lost writes!
  • 46. building correct distributed systems is not straightforward.. • how do you handle replica failures? • how do you ensure there is not a parallel quorum? • how do you handle partial failures of replicas? • how do you handle concurrent failures?
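One standard defense against the parallel-quorum question above is epoch (generation) numbering; this is a general technique, not necessarily what Amazon ships. Every quorum configuration carries a monotonically increasing epoch, and replicas reject requests stamped with a stale epoch, so a partitioned old primary cannot keep committing writes after a new quorum forms.

```python
# Illustrative epoch fencing: a deposed writer's requests are rejected.
class EpochReplica:
    def __init__(self):
        self.epoch = 0
        self.log = []

    def promote(self, new_epoch):
        """A new quorum installs a strictly higher epoch."""
        if new_epoch <= self.epoch:
            return False
        self.epoch = new_epoch
        return True

    def write(self, value, epoch):
        if epoch < self.epoch:
            raise PermissionError("stale epoch; writer was deposed")
        self.epoch = epoch
        self.log.append(value)
        return True

r = EpochReplica()
r.write("a", epoch=1)       # old primary writing in epoch 1
r.promote(2)                # a new quorum is elected at epoch 2
rejected = False
try:
    r.write("b", epoch=1)   # the deposed primary tries to keep writing
except PermissionError as e:
    rejected = True
    print(e)                # stale epoch; writer was deposed
```

The same idea appears as "fencing tokens" in lock services and as ballot/term numbers in Paxos and Raft.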
  • 47. correctness is hard, but necessary
  • 48. Formal Methods
  • 49. Formal Methods: to minimize bugs, we must have a precise description of the design
  • 50. Formal Methods: code is too detailed; design documents and diagrams are vague & imprecise. how would you express partial failures or concurrency?
  • 51. Formal Methods: the law of large numbers is your friend, until you hit large numbers. so design for scale
  • 52. TLA+ to the rescue?
  • 53. PlusCal
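The spirit of TLA+/PlusCal, in miniature (our toy in Python, not the real tool): describe the system as states and transitions, then exhaustively explore every interleaving and check an invariant on each one. Here two processes perform a non-atomic read-increment-write, and the checker mechanically finds every interleaving that loses an update, something a design document or a few unit tests would likely miss.

```python
# Toy exhaustive model checker for a two-process counter increment.
from itertools import permutations

def run(schedule):
    """Execute one interleaving of read/write steps; return final counter."""
    shared = 0
    local = {}
    for proc, step in schedule:
        if step == "read":
            local[proc] = shared
        else:  # "write"
            shared = local[proc] + 1
    return shared

steps = [("p1", "read"), ("p1", "write"), ("p2", "read"), ("p2", "write")]
violations = []
for order in set(permutations(steps)):
    # Keep only orderings where each process reads before it writes.
    if all(order.index((p, "read")) < order.index((p, "write"))
           for p in ("p1", "p2")):
        if run(order) != 2:   # invariant: both increments take effect
            violations.append(order)
print(f"{len(violations)} interleavings lose an update")   # 4 of the 6 valid ones
```

Real TLA+ does this over vastly larger state spaces, with temporal properties and fairness, but the workflow is the same: precise model in, counterexample out.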
  • 54. formal methods are necessary but not sufficient..
  • 55. customer
  • 56. don't forget to test - no, seriously
  • 57. embrace failure and don't be surprised: simulate failures at the unit test level, fault injection testing, scale testing, datacenter testing, network brown-out testing
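A minimal fault-injection harness of the kind the slide advocates (illustrative; the class and function names are ours): wrap a dependency so tests can force timeouts deterministically, then assert that the caller's retry logic actually holds up under a high failure rate.

```python
# Illustrative fault injection at the unit-test level.
import random

class FlakyStore:
    """Wraps a real store; fails a configurable fraction of calls."""
    def __init__(self, store, failure_rate, rng):
        self.store, self.failure_rate, self.rng = store, failure_rate, rng

    def put(self, k, v):
        if self.rng.random() < self.failure_rate:
            raise TimeoutError("injected fault")
        self.store[k] = v

def put_with_retries(store, k, v, attempts=5):
    """The code under test: retry writes a bounded number of times."""
    for _ in range(attempts):
        try:
            store.put(k, v)
            return True
        except TimeoutError:
            continue
    return False

backing = {}
flaky = FlakyStore(backing, failure_rate=0.5, rng=random.Random(42))
ok = sum(put_with_retries(flaky, f"k{i}", i) for i in range(100))
print(ok, "of 100 writes succeeded despite 50% injected faults")
```

Seeding the random generator keeps the failure schedule reproducible, which is what turns "it failed once in prod" into a repeatable unit test.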
  • 58. testing is a lifelong journey
  • 59. testing is necessary but not sufficient..
  • 60. Customer Architect
  • 61. release cycle: gamma (simulate the real world) -> one box (does it work?) -> phased deployment (treading lightly) -> monitor (does it still work?)
  • 62. monitor customer behavior
  • 63. measuring customer experience is key: don't be satisfied by the average - look at the 99th percentile
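Why the slide says not to be satisfied by the average: a handful of slow requests barely move the mean but dominate the 99th percentile, which is what your unluckiest customers actually experience. A quick illustration with made-up latencies:

```python
# Illustrative: mean vs. p99 on a workload where 1% of requests are slow.
def percentile(samples, p):
    """Nearest-rank percentile (simplified, good enough to make the point)."""
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(p / 100 * len(ordered)))
    return ordered[idx]

latencies_ms = [10] * 990 + [2000] * 10    # 1% of requests take 2 seconds
mean = sum(latencies_ms) / len(latencies_ms)
print(f"mean = {mean:.1f} ms")             # 29.9 ms: looks healthy
print(f"p99  = {percentile(latencies_ms, 99)} ms")   # 2000 ms: it is not
```

At Amazon's request volumes, "1% of requests" is an enormous number of customers, which is exactly the slide's point.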
  • 64. understand the scaling dimensions
  • 65. understand how your service will be abused
  • 66. let's see these rules in action through a true story
  • 67. we were building distributed systems all over
  • 68. we needed a uniform and correct way to do consensus..
  • 69. so we built a paxos lock service
  • 70. such a service is so much more useful than just leader election.. it became a distributed state store
  • 71. such a service is so much more useful than just leader election.. or a distributed state store. wait wait.. you're telling me if I poll, I can detect node failure?
  • 72. we acted quickly - and scaled up our entire fleet with more nodes. doh!!!! we slowed consensus...
  • 73. understand the scaling dimensions & scale them independently...
  • 74. a lock service has 3 components.. State Store
  • 75. they must be scaled independently.. State Store
  • 76. they must be scaled independently.. State Store
  • 77. they must be scaled independently.. State Store
  • 78. let's go over the demo from this morning
  • 79. stream ingestion
  • 80. stream ingestion
  • 81. stream ingestion
  • 82. real-time tweet analytics using DynamoDB • stream from Kinesis to DynamoDB • what data do we want in real time? (per-second, top words) • how does DynamoDB help? atomic counters (per-word counts in that second); indexed queries (top N word-counts in that second)
  • 83. WordCount table and its local secondary index:

        WordCount Table                         Local Secondary Index
        Time              Word   Count          Time              Count  Word
        2013-10-13T12:00  Earth  9              2013-10-13T12:00  5      Pluto
        2013-10-13T12:00  Mars   10             2013-10-13T12:00  9      Earth
        2013-10-13T12:00  Pluto  5              2013-10-13T12:00  10     Mars
        2013-10-13T12:03  Earth  8              2013-10-13T12:03  8      Earth
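The pattern behind these two tables can be sketched in memory (this is our illustration of the access pattern, not real DynamoDB API calls): increment an atomic counter per (second, word), and keep an index ordered by count so "top N words in a second" is a cheap range query rather than a full scan.

```python
# In-memory sketch of per-second word counters plus a count-ordered index.
from collections import defaultdict

counts = defaultdict(int)             # (time_bucket, word) -> count

def add_word(time_bucket, word, n=1):
    """Stands in for an atomic UpdateItem ADD on the WordCount table."""
    counts[(time_bucket, word)] += n

def top_words(time_bucket, n):
    """Stands in for a descending Query on the count-keyed secondary index."""
    in_bucket = [(c, w) for (t, w), c in counts.items() if t == time_bucket]
    return sorted(in_bucket, reverse=True)[:n]

for w in ["Earth"] * 9 + ["Mars"] * 10 + ["Pluto"] * 5:
    add_word("2013-10-13T12:00", w)
print(top_words("2013-10-13T12:00", 2))   # [(10, 'Mars'), (9, 'Earth')]
```

In the real table the atomic increment is what makes concurrent writers from many Kinesis shards safe, and the local secondary index keeps the count ordering without a second write path in application code.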
  • 84. DynamoDB cost: $0.25 / hr
  • 85. stream ingestion
  • 86. aggregate queries using Redshift • simple Redshift connector (buffer files, store in S3, call the COPY command) • manifest-copy connector: 2 streams, a transaction table for deduplication, manifest COPY
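The deduplication idea in the connector can be sketched as follows (our reading of the slide, not the actual connector code; the names are ours): before loading a batch of buffered S3 files, record the batch id in a transaction table; a batch id seen before is skipped, so replayed stream records never double-load rows into the warehouse.

```python
# Illustrative manifest-copy dedup via a transaction table.
loaded_batches = set()    # stands in for the transaction table
warehouse_rows = []       # stands in for the Redshift table

def copy_manifest(batch_id, files):
    """Load one manifest's files exactly once, even if redelivered."""
    if batch_id in loaded_batches:
        return False                  # duplicate delivery: skip the COPY
    loaded_batches.add(batch_id)
    for f in files:
        warehouse_rows.extend(f)      # stands in for COPY from S3
    return True

copy_manifest("batch-1", [["row1", "row2"]])
copy_manifest("batch-1", [["row1", "row2"]])   # replayed by the stream
copy_manifest("batch-2", [["row3"]])
print(warehouse_rows)   # ['row1', 'row2', 'row3'], no duplicates
```

In production the "record the batch id" and "COPY" steps must be made atomic (or at least idempotent on retry), which is why the slide pairs the manifest COPY with a transaction table rather than a plain flag.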
  • 87. right tool for the right job… • Canal -> DynamoDB -> Redshift -> Glacier…
  • 88. you are not done yet.. • listen to customer feedback • iterate..
  • 89. example: DynamoDB • start with the immediate needs: a reliable, super scalable, low latency datastore • iterate: • developers wanted flexible query: Local Secondary Indexes • developers wanted parallel loads: Parallel Scans • mobile developers wanted direct access to their datastore: Fine-Grained Access Control • mobile developers wanted geo-awareness: Geospatial Library • developers wanted DynamoDB on their laptop: DynamoDB Local • developers wanted richer query: Global Secondary Indexes • we will continue to innovate..
  • 90. sacred tenets in distributed systems: don't compromise durability for performance; plan for success - plan for scalability; plan for failures - fault tolerance is key; consistent performance is important; release - think of blast radius; insist on correctness
  • 91. understand scaling dimensions; observe how the service is used; monitor like a hawk; relentlessly test; scalability over features; strive for correctness
  • 92. please give us your feedback on this presentation (SPOT 401). don't miss SPOT 201!!!
  • 93. @swami_79 @ksshams