Dynamo Systems - QCon SF 2012 Presentation

A look at Dynamo-based systems: the architectural principles, use cases and requirements; where they differ from relational databases; and where they are going.

  • Twitter.com/shanley, shanley@basho.com
  • Basho makes a highly available, open source, distributed database called Riak that is based on the principles of the Dynamo paper.
  • The Dynamo paper was published about 5 years ago and outlines a set of properties for a highly available key/value store that provides an “always-on” experience for end users. Dynamo is an internal service at Amazon that powers parts of AWS, including S3, and the Amazon.com shopping cart.
  • As the Dynamo paper discusses, a single page request to Amazon can result in requests to over 150 services, many of which have their own dependencies. High availability and low-latency must be created across the entire service-oriented architecture to present these properties in the user-facing experience.
  • Principles from the Dynamo paper have spawned many open-source and commercial projects, including Cassandra (which came out of Facebook; commercial players include Datastax and Acunu) and Project Voldemort (an open source project out of LinkedIn). Basho’s Riak, launched in 2009, is based on many of Dynamo’s principles for distributed systems and extends the core functionality with full-text search, multi-datacenter replication, MapReduce and other features.
  • The Amazon shopping cart is a canonical use case of an application requiring the high availability and low-latency that Dynamo-based systems provide. Amazon itself serves as a classic example of both a new class of website/application with a different set of business requirements, as well as a new breed of infrastructure provided by its Web Services platform. For both scenarios, unavailability has a direct impact on revenue, as well as on the user trust required for a healthy business.
  • Beyond availability and low-latency, the Dynamo paper cites a number of other requirements for its data store. These are important design considerations that profoundly affect the implementation of the database.
  • In the past, these problems might have been addressed with a big server and some MySQL. Why hasn’t Amazon taken that route?
  • MySQL systems are consistent. Consistent systems will not be write available during failure conditions, focusing on correctness over serving requests. Ensuring write-availability is a critical aspect of the Dynamo design.
  • CAP theorem states that in the event of a network partition, systems must choose consistency or availability – not both.
  • However, CAP is only one of many considerations in database design. We are in the early stages of databases that are designed to handle new types of operational environments and application needs / constraints.
  • As the Dynamo paper shows, many other factors play a role – including the expense of hardware and people to maintain the system, and the need for a scale-out model.
  • Indeed, even in database designs like Google’s Spanner (in contrast, a consistent distributed database), some of the primary concerns beyond CAP are the ability to scale to multiple datacenters and the operational cost of data growth.
  • Perhaps it’s time for a new theorem!
  • Amazon’s application requirements in creating Dynamo.
  • Amazon’s definition of developer productivity in relation to Dynamo.
  • Amazon’s operational requirements.
  • Turns out, Amazon’s needs echoed trends going on in the larger world. It’s not all about CAP – it’s about the big shift to distributed systems. Distributed systems aren’t just about DNS and CDNs anymore. The business needs that require a distributed system are relevant to more and more companies. As a result, you’re seeing more and more “old” ideas (databases, storage, monitoring) being re-architected and re-thought for a new, distributed environment.
  • How requirements for applications have changed.
  • How the way we define developer productivity has changed.
  • How the operational environment we are building in has changed.
  • These are the other hard problems in distributed systems that touch CAP but also speak beyond it – areas the Dynamo paper posits approaches to. Dynamo can really be seen as a collection of technologies and approaches that produce the desired properties of the database… all interconnected in the circle of life. Let’s take a look at how these areas have been handled in a relational world vs how Dynamo addresses them.
  • In a relational world, you might have thrown all your data (let’s say you’re building a database of bunny pictures and information) on one big machine.
  • As you grew, you might shard the data across multiple machines through some logical division of data.
  • Sharding, however, can lead to hot spots in the database. Also, writing and maintaining sharding logic increases the overhead of operating and developing on the database. Significant growth of data or traffic typically means significant, often manual resharding projects. Figuring out how to intelligently split the dataset without negative impacts on performance, operations and development presents a significant challenge.
  • Two of Dynamo’s goals were to reduce hot spots in the database, and reduce the operational burden of scale.
  • To accomplish this, Dynamo uses consistent hashing, a technique pioneered at Akamai. In consistent hashing, you start with a 160-bit integer space…
  • And divide it into equally sized partitions that form a range of possible values.
  • Keys are hashed onto the integer space. Keys are “owned” by the partition they are hashed onto, and partitions, or virtual nodes, are claimed by physical nodes in the cluster. The even distribution of data created by the hashing function, and the fact that physical nodes share responsibility for partitions, results in a cluster that shares the load.
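  • The partitioning scheme described above can be sketched in a few lines of Python. This is a simplified illustration, not Riak’s actual implementation; the partition count and node names are made up for the example.

```python
import hashlib

# A minimal sketch of Dynamo-style consistent hashing: the 160-bit SHA-1
# space is divided into equally sized partitions, and physical nodes claim
# partitions (virtual nodes) round-robin.

RING_SIZE = 2 ** 160      # SHA-1 output space
NUM_PARTITIONS = 64       # illustrative; Riak uses a configurable power of two

def hash_key(key: str) -> int:
    """Hash a key onto the 160-bit integer space."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

def build_ring(nodes):
    """Assign partitions to physical nodes round-robin; returns a list of
    (partition_start, owning_node) in ring order."""
    partition_size = RING_SIZE // NUM_PARTITIONS
    return [(i * partition_size, nodes[i % len(nodes)])
            for i in range(NUM_PARTITIONS)]

def owner(ring, key: str):
    """Find the partition that owns a key and the node that claims it."""
    partition_size = RING_SIZE // NUM_PARTITIONS
    index = hash_key(key) // partition_size
    return ring[index]

ring = build_ring(["node-a", "node-b", "node-c"])
start, node = owner(ring, "bunny-picture-42")
```

  Because the hash function spreads keys evenly across partitions and the partitions are spread evenly across nodes, no single machine becomes a hot spot.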
  • In a relational world, scaling up might mean getting bigger machines…
  • Or re-sharding your sharding scheme.
  • At massive scale though, this might land you in sharding hell.
  • You might note that the word “shard” appears zero times in the Dynamo paper.
  • When new nodes are added to a Dynamo cluster, a joining node takes over partitions until responsibility is again equal. Existing nodes hand off data for the appropriate key spaces. Cluster state is shared through a “gossip protocol” and nodes update their view of the cluster as it changes.
  • This is one of the major innovations in the Dynamo paper. Decoupling the partitioning scheme from how partitions are assigned to physical nodes means that data load can be evenly shared by the cluster, and adding nodes doesn’t require manually figuring out where to put data.
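  • A toy sketch of that rebalancing idea (not Riak’s actual claim algorithm): when a node joins, it takes over partitions from over-loaded owners until every node claims roughly the same number.

```python
from collections import Counter

def rebalance(assignment, new_node):
    """assignment: list of node names, one entry per partition (ring order).
    The joining node claims partitions from nodes owning more than the
    even-share target; each reassigned partition's data is handed off."""
    nodes = sorted(set(assignment)) + [new_node]
    target = len(assignment) // len(nodes)
    counts = Counter(assignment)
    new_assignment = list(assignment)
    taken = 0
    for i, n in enumerate(new_assignment):
        if taken >= target:
            break
        if counts[n] > target:
            counts[n] -= 1
            new_assignment[i] = new_node  # existing node hands off this range
            taken += 1
    return new_assignment

before = ["a", "b", "c"] * 4   # 12 partitions, 3 nodes, 4 each
after = rebalance(before, "d") # now 4 nodes with 3 partitions each
```

  Note that only the reassigned key ranges move; the rest of the data stays put, which is what keeps scale-out cheap.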
  • Data in a relational database is generally replicated using a master/slave setup to ensure consistency.
  • Writes are applied through a master node, which ensures that the writes are applied consistently to both master and slave nodes.
  • Reads can occur at slave or master nodes, but writes must go through the master.
  • This can be problematic in the event of master failure.
  • It can take time for slave nodes to realize the master is no longer available and to elect a new master.
  • In the interim, that data is unavailable for writes and updates.
  • The Dynamo paper states that there is no master whose failure can cause write unavailability, and that all nodes are equal.
  • In a Dynamo-based system, all nodes responsible for a replica can serve read and write requests. The system uses eventual consistency – as opposed to all replicas seeing an update at the same time, all replicas will see updates eventually – in practice, usually a very small window until all replicas reflect the most recent value.
  • Dynamo systems also provide w and r values on requests to maintain availability despite failure, and to allow the developer to tune to some extent the “correctness” of reads and writes.
  • Lower w and r values produce high availability and lower latency.
  • With a w or r value equal to the number of replicas, higher correctness is possible.
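  • The trade-off between the r/w values and correctness comes down to simple quorum arithmetic. Assuming N replicas per key, a read is guaranteed to overlap the latest successful write when r + w > N:

```python
# Quorum overlap sketch: with N replicas, any read quorum of size r must
# intersect any write quorum of size w whenever r + w > N, so the read
# sees at least one replica holding the latest write.

def overlaps(n: int, r: int, w: int) -> bool:
    """True if every read quorum must intersect every write quorum."""
    return r + w > n

N = 3
assert overlaps(N, r=2, w=2)      # typical quorum: reads see the last write
assert not overlaps(N, r=1, w=1)  # lowest latency, but reads may be stale
assert overlaps(N, r=1, w=3)      # w == N: higher correctness, lower write availability
```

  Lower values trade correctness for latency and availability; values summing past N trade the other way.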
  • In any system that uses an eventually consistent model and replicates data, you run the risk of divergent data or “entropy”. Two common scenarios for this are when not all writes have propagated to all nodes…
  • Or when different clients update the same datum concurrently.
  • The solution… looks like this.
  • A vector clock is a piece of metadata attached to each object.
  • It gives the developer a choice: to resolve a conflict at the data store level (“last write wins”), or let the client resolve it with business logic relevant to the use case.
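  • A minimal sketch of how vector clocks detect that choice point (actor names are illustrative): each clock maps an actor to a counter, one clock “descends from” another if it is at least as large in every entry, and two clocks that don’t descend from each other are a true conflict.

```python
# Vector clock sketch: {actor: counter}. Incremented by whichever client
# or node coordinates an update.

def increment(clock, actor):
    clock = dict(clock)
    clock[actor] = clock.get(actor, 0) + 1
    return clock

def descends(a, b):
    """True if clock a has seen everything clock b has."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())

def concurrent(a, b):
    """Neither descends from the other: a genuine conflict (siblings)."""
    return not descends(a, b) and not descends(b, a)

v1 = increment({}, "client-x")   # {"client-x": 1}
v2 = increment(v1, "client-x")   # x updates again: descends from v1
v3 = increment(v1, "client-y")   # y updates the older v1 concurrently
assert descends(v2, v1)
assert concurrent(v2, v3)        # store keeps both; client (or "last write
                                 # wins") resolves the conflict
```

  When two writes are concurrent, the store can’t know which is “right” – so it either picks one (last write wins) or hands both back to the application.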
  • Dynamo also provides some nice anti-entropy features, including read repair.
  • Dynamo also has a slick way to maintain write availability during node failure.
  • If a node becomes unavailable due to hardware failure or network partition, writes/updates for that node will go to a fallback.
  • This neighboring node will “hand off” data when the original node returns to the cluster.
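  • This mechanism is known as hinted handoff. A simplified sketch of the idea (class and key names are made up for illustration):

```python
# Hinted handoff sketch: if a replica's node is down, the write goes to a
# fallback node along with a "hint" naming the intended owner; when the
# owner rejoins, the fallback hands the hinted data off and drops the hint.

class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value
        self.hints = {}   # intended_owner -> {key: value}

def write(preferred, fallback, key, value):
    if preferred.up:
        preferred.data[key] = value
    else:
        fallback.hints.setdefault(preferred.name, {})[key] = value

def handoff(fallback, returned):
    """Called when the owner rejoins the cluster."""
    for key, value in fallback.hints.pop(returned.name, {}).items():
        returned.data[key] = value

a, b = Node("a"), Node("b")
a.up = False
write(a, b, "cart:123", ["bunny plush"])  # a is down; b holds the hint
a.up = True
handoff(b, a)                             # b hands the write back to a
```

  The cluster stays write-available through the failure, and the returning node catches up without operator intervention.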
  • Another significant point is how Dynamo changes application development and how we define developer productivity.
  • There is now more unstructured data in the world, and more apps that don’t require a strong schema. Using a “schema-less” key/value data model, you can eliminate some of the need for extensive data model “pre-planning”, and change applications or develop new features without changing the underlying data model. It’s simpler and, for applications that fit a key/value model, more productive.
  • A look at common apps and use cases for Dynamo-like systems and simple approaches to building data on them with a key/value scheme…
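  • For a flavor of what key/value modeling looks like in practice, here is an illustrative sketch (the bucket layout and helper names are assumptions, not a specific Riak API): instead of normalized tables, you compose meaningful keys and store denormalized values.

```python
import json

store = {}  # stand-in for a key/value database

def put(bucket, key, value):
    store[f"{bucket}/{key}"] = json.dumps(value)

def get(bucket, key):
    return json.loads(store[f"{bucket}/{key}"])

# One object per user, one per photo; a list of keys inside the user object
# acts as a lightweight index instead of a JOIN.
put("users", "shanley", {"name": "Shanley", "photos": ["photo:1", "photo:2"]})
put("photos", "photo:1", {"caption": "bunny in a teacup"})

user = get("users", "shanley")
first_photo = get("photos", user["photos"][0])
```

  Lookups are always by key, so adding a field to the value never requires a schema migration.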
  • What does the future of NoSQL hold?
  • Many “NoSQL” systems have thrown out consistency in favor of availability…
  • But maybe we also threw out a lot of useful things in the process. What about higher-level data types, search, aggregation tasks, and the developer-friendly things we love in relational databases? We’ve done a lot so far to address some of the underlying architectural requirements of new apps and platforms, but there is more we can do to enable greater queriability and broader use cases.
  • There is growing research and practice around ways we can offer more data types on top of distributed systems, data types that are tolerant of an eventually consistent design.
  • At Basho, we are applying this research to offer more advanced data types like counters and sets in a future Riak release, and some implementations of this can already be seen in the wild.
  • And what about consistency? The author of the CAP theorem has stated that CAP doesn’t forbid CA during normal operations…
  • So can we create systems that can offer both availability and consistency or offer greater guarantees around consistency? Like offering conditional writes - failing a write if intervening requests occur, or if they don’t meet some other requirement? Or offering Paxos-like implementations to ensure strict ordering among replicas? What about consistent reads that always reflect the last written value?
  • The problem with offering other advanced features – like Search, MapReduce, storing terms and indexes - in distributed systems like Dynamo is that data is ALL AROUND THE CLUSTER… these tasks require finding data and performing tasks on many different nodes in the system.
  • Oftentimes, the response to date has been to run secondary clusters. However, this requires cluster replication and carries a higher operational burden.
  • What if we find ways to see data locality / even distribution as more of a spectrum, allowing us to perform advanced tasks without degrading performance or requiring secondary clusters?
  • Biology tells us that
  • Rapid emergence of new species and variations
  • Caused by significant events (like the Dynamo paper, the explosion of commodity hardware, cloud computing, mobile/social revolution)…
  • Can lead to new types of organisms (NoSQL, NewSQL, combinations of NoSQL and MySQL)…
  • But it’s our job to evolve them further into more features / functionality than ever before!
  • Because it’s hard….
