Bathcamp 2010-riak

Published in: Technology, News & Politics


  • 1. Timothy Perrett Bath Camp 2010
  • 2. What is Riak? • Document-oriented database • Written in Erlang • Based on Dynamo[1] and the CAP theorem[2] • Highly fault tolerant • HTTP and Protocol Buffers interfaces • Write MapReduce in Erlang or JavaScript 1. 2.
  • 3. Same, same but different • Riak solves similar problems to MongoDB • Semi-structured data modeled as “documents” • Storage of non-document data in the database • High write availability • Riak is intrinsically multi-node scalable • Mongo, in comparison, is a single system (+ sharding) • Riak achieves availability via quorum writes • Mongo uses performant in-place writes • Riak uses “masterless” replication
  • 4. N/R/W – Dynamo • N = Number of replicas to store • R = Number of replicas needed to read • W = Number of replicas needed to write • These principles first appeared in an Amazon research paper known as Dynamo
  • 5. Number of replicas • 160-bit integer key space. Each node that joins is assigned part of that space for consistent hashing • Hashing means any node can service any request, making the cluster masterless and eventually consistent
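A minimal sketch of the idea on this slide, under stated assumptions: SHA-1 is used because it yields a 160-bit digest, the ring is cut into a small fixed number of equal partitions, and nodes claim partitions round-robin. The partition count, the `bucket/key` hashing input and the helper names are illustrative, not Riak's actual implementation.

```python
import hashlib

RING_SIZE = 2 ** 160  # a SHA-1 digest is 160 bits wide


def key_position(bucket: str, key: str) -> int:
    """Hash a bucket/key pair to an integer position on the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def build_ring(nodes, num_partitions=8):
    """Assign each partition of the key space to a node, round-robin (assumed policy)."""
    return [nodes[i % len(nodes)] for i in range(num_partitions)]


def preference_list(ring, position, n):
    """Return the owners of the N consecutive partitions starting at the key's partition."""
    partition_size = RING_SIZE // len(ring)
    first = position // partition_size
    return [ring[(first + i) % len(ring)] for i in range(n)]


ring = build_ring(["n0", "n1", "n2", "n3"])
position = key_position("talks", "bathcamp-2010")
# Any node can run this same calculation, so no master is needed to route a request.
print(preference_list(ring, position, n=3))
```

Because the result depends only on the key and the shared ring layout, every node computes the same owners for a given key.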
  • 6. Reads • R is the number of replies required before Riak gives the client a successful reply • Riak tries to access all N replicas, but a response is given as soon as R is satisfied
  • 7. Writes • Same as reads; W is the number of nodes that must reply successfully before the client considers the write successful
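The read and write rules can be sketched as a toy in-memory simulation. The `Replica` class and both functions are hypothetical names for illustration only. A real cluster contacts replicas in parallel and answers as soon as enough replies arrive, whereas this sketch simply counts replies from the replicas that are up.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


def quorum_write(replicas, key, value, w):
    """Send the write to every replica; succeed if at least W acknowledge it."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = value
            acks += 1
    return acks >= w


def quorum_read(replicas, key, r):
    """Ask every replica; return the first R replies, or None if fewer than R answer."""
    replies = [rep.data[key] for rep in replicas if rep.up and key in rep.data]
    if len(replies) < r:
        return None
    return replies[:r]


replicas = [Replica(f"n{i}") for i in range(3)]  # N = 3
replicas[2].up = False                           # one replica is unreachable
print(quorum_write(replicas, "doc", "v1", w=2))  # two acknowledgements still satisfy W=2
print(quorum_read(replicas, "doc", r=2))
```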
  • 8. Extreme example • Given N=10 and R=W=2, we could have 8 of the 10 replica nodes down and the cluster would still be fully available to all clients
  • 9. What does this all mean? • N/R/W specified at request time, so each client can specify its own tolerance for outages dynamically • Despite any outages within the cluster, the whole cluster can still appear available based on N/R/W • Given N=3 and R=W=2, we can have 3-2=1 node down/unreachable/laggy in the cluster • Stupidly high availability complete with eventual consistency controlled by dynamic clients
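The arithmetic behind both examples (N=3 and the extreme N=10 case) fits in one small function. This is a sketch of the slides' reasoning only: the function name is made up, and it takes the stricter of R and W so that both reads and writes remain available.

```python
def tolerated_failures(n: int, r: int, w: int) -> int:
    """Replicas that may be down while both reads (R) and writes (W) still succeed."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    return n - max(r, w)


print(tolerated_failures(3, 2, 2))   # 3 - 2 = 1
print(tolerated_failures(10, 2, 2))  # 10 - 2 = 8
```

Since N/R/W are chosen per request, a client can call this with its own values to see how much outage it is choosing to tolerate.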
  • 10. Brewer’s CAP Theorem • Consistency • Availability • Partition Tolerance • You can’t have all things, all the time… • …but you can have some of each, all the time! • Riak is about choosing your own levels of each according to your use case
  • 11. Consistency • Start with document version zero • Things get redistributed and n0 and n2 are sitting in NYC and n1 and n3 are in London • What if stuff changes??
  • 12. Consistency • Uh oh: inconsistency • Both parts of the cluster are still fully available • NYC serves v1 whilst London serves v0 • The network resumes and Riak determines the latest version by using vector clocks
  • 13. Consistency • What if both sides of the Atlantic changed? • Riak is unable to determine which is the right document, so both are returned to the client with an indication of the inconsistency
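The two consistency scenarios can be illustrated with a simplified vector clock, modeled here as a plain dict from node name to update counter. This is an assumption made for illustration, and Riak's real vector clocks carry more than this. If one clock descends from the other, the newer version wins. If neither does, the updates were concurrent and both versions go back to the client.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def reconcile(a, b):
    """a and b are (clock, value) pairs; return the winner, or both on conflict."""
    clock_a, _ = a
    clock_b, _ = b
    if descends(clock_a, clock_b):
        return [a]
    if descends(clock_b, clock_a):
        return [b]
    return [a, b]  # concurrent updates: the client must resolve them


v0 = ({"n0": 1}, "version zero")
nyc = (increment(v0[0], "n0"), "edited in NYC")
london = (increment(v0[0], "n1"), "edited in London")

print(reconcile(nyc, v0))      # only NYC changed: its version supersedes v0
print(reconcile(nyc, london))  # both sides changed: both versions are returned
```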
  • 14. Riak Search • Distributed, fault-tolerant full-text searching • Lucene syntax for queries • No need for index sharding • Linear scaling • Double the number of nodes to get double the search capacity (awesome!) • Search via fields, wildcards, fuzzy text or token proximity
  • 15. Questions?