Bathcamp 2010-riak

  • 1. Timothy Perrett
    Bath Camp 2010
  • 2. What is Riak?
    Document-oriented database
    Written in Erlang
    Based on Dynamo[1] and CAP Theorem[2]
    Highly fault tolerant
    HTTP and Protocol Buffers interfaces
    Write MapReduce in Erlang or JavaScript
    1. http://goo.gl/r8Np
    2. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
  • 3. Same, same but different
    Riak solves similar problems to MongoDB
    Semi-structured data modeled as “documents”
    Storage of non-document data in the database
    High write-availability
    Riak is intrinsically multi-node scalable
    MongoDB, in comparison, is a single-system database (plus sharding)
    Riak achieves availability via quorum writes
    Mongo uses performant in-place writes
    Riak uses “masterless” replication
  • 4. N/R/W – Dynamo
    N = Number of replicas to store
    R = Number of replicas needed to read
    W = Number of replicas needed to write
    These principles first appeared in the Amazon research paper known as Dynamo
  • 5. N: number of replicas
    160-bit integer key space. Each node that joins is assigned part of that space for consistent hashing
    Hashing means any node can service any request, making the cluster masterless and eventually consistent
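The key-space idea on slide 5 can be sketched roughly as follows. This is a minimal illustration, not Riak's actual implementation: the `bucket/key` string format, the partition count of 64, the round-robin assignment, and all function names are assumptions made for the example. Only the 160-bit key space (the size of a SHA-1 digest) comes from the slide.

```python
import hashlib
from bisect import bisect_right

RING_SIZE = 2 ** 160  # SHA-1 digests are 160 bits wide


def key_position(bucket: str, key: str) -> int:
    """Map a bucket/key pair to an integer position on the 160-bit ring.

    The "bucket/key" string format is an assumption for this sketch.
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def build_ring(nodes, partitions=64):
    """Split the ring into equal partitions and deal them out to nodes in turn."""
    step = RING_SIZE // partitions
    starts = [i * step for i in range(partitions)]
    owners = [nodes[i % len(nodes)] for i in range(partitions)]
    return starts, owners


def preference_list(starts, owners, position, n):
    """Return the owners of the N partitions that follow the key's position."""
    first = bisect_right(starts, position) % len(starts)
    return [owners[(first + i) % len(owners)] for i in range(n)]


starts, owners = build_ring(["n0", "n1", "n2", "n3"])
position = key_position("slides", "riak")
replicas = preference_list(starts, owners, position, n=3)
print(replicas)  # three distinct nodes, the same on every run
```

Because every node can compute the same hash and the same ring, any node can work out which replicas hold a key, which is what lets the cluster run without a master.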
  • 6. R: number of replica replies required before Riak gives the client a successful reply
    Riak tries to access all N replicas, but responds as soon as R of them have replied
  • 7. Same as reads: W is the number of nodes that must reply successfully before the client considers the write consistent
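The "reply as soon as enough replicas have answered" behaviour described for reads and writes can be sketched as below. This is a toy simulation under stated assumptions: `quorum_request` is a hypothetical helper, and replica replies are modelled as a list of booleans in arrival order rather than as real network calls.

```python
def quorum_request(replica_ok, needed):
    """Walk replica replies in arrival order and stop once `needed` have succeeded.

    Returns (success, replies_seen). `needed` plays the role of R for a read
    or W for a write.
    """
    acks = 0
    for seen, ok in enumerate(replica_ok, start=1):
        if ok:
            acks += 1
        if acks >= needed:
            return True, seen
    return False, len(replica_ok)


# N=3, R=2: the request succeeds after two good replies, without waiting for the third.
print(quorum_request([True, True, False], needed=2))   # (True, 2)
# One replica is down, but two good replies still arrive.
print(quorum_request([True, False, True], needed=2))   # (True, 3)
# Only one replica answers, so the quorum is not met.
print(quorum_request([False, False, True], needed=2))  # (False, 3)
```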
  • 8. Extreme example
    Given N=10 and R=W=2, we could have 8 nodes down and the cluster would still be fully available to all clients
  • 9. What does this all mean?
    N/R/W specified at request time, so each client can specify its own tolerance for outages dynamically
    Despite any outages within the cluster, the whole cluster can still appear available based on N/R/W
    Given N=3 and R=W=2, we can have 3-2=1 node down/unreachable/laggy in the cluster
    Stupidly high availability complete with eventual consistency controlled by dynamic clients
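The arithmetic behind the two examples above (N=10 with R=W=2, and N=3 with R=W=2) can be written out as a small helper. `tolerated_failures` is a hypothetical name used only for this sketch, and it assumes that "available" means both reads and writes can still reach their quorum.

```python
def tolerated_failures(n: int, r: int, w: int) -> int:
    """Replicas that can be unreachable while reads and writes both still succeed."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    # The stricter of the two quorums decides how many replicas must stay up.
    return n - max(r, w)


print(tolerated_failures(10, 2, 2))  # 8
print(tolerated_failures(3, 2, 2))   # 1
```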
  • 10. Brewer’s CAP Theorem
    Consistency, Availability, Partition Tolerance
    You can't have all things, all the time…
    …but you can have some of each, all the time!
    Riak is about choosing your own levels of each according to your use case
  • 11. Consistency
    Start with document version zero
    Things get redistributed and n0 and n2 are sitting in NYC and n1 and n3 are in London
    What if stuff changes??
  • 12. Consistency
    Uh oh: inconsistency
    Both parts of the cluster are still fully available
    NYC serves v1 whilst London serves v0
    The network resumes and Riak determines the latest version by using vector clocks
  • 13. Consistency
    What if both sides of the Atlantic changed?
    Riak cannot determine which document is the right one, so both are returned to the client with an indication of the inconsistency
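The two outcomes on the consistency slides (one side is newer, or both sides changed) follow from how vector clocks are compared. The sketch below is a generic vector-clock comparison, not Riak's own code: clocks are modelled as plain dictionaries of per-actor counters, and the actor names (`n0`, `nyc`, `london`) are made up for the example.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every update recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify two versions as equal, ordered, or in conflict."""
    a_over_b, b_over_a = descends(a, b), descends(b, a)
    if a_over_b and b_over_a:
        return "equal"
    if a_over_b:
        return "a is newer"
    if b_over_a:
        return "b is newer"
    return "conflict"  # concurrent updates: hand both versions to the client


v0 = {"n0": 1}                   # document version zero
nyc = {"n0": 1, "nyc": 1}        # updated only on the NYC side
london = {"n0": 1, "london": 1}  # updated only on the London side

print(compare(nyc, v0))      # a is newer
print(compare(nyc, london))  # conflict
```

When only one side changed, its clock descends from the other and the latest version is unambiguous. When both sides changed, neither clock descends from the other, so there is no way to pick a winner automatically.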
  • 14. Riak Search
    Distributed, fault-tolerant full-text searching
    Lucene syntax for queries
    No need for index sharding
    Linear scaling
    Double the number of nodes to get double the search capacity (awesome!)
    Search via fields, wildcards, fuzzy text or token proximity
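For readers unfamiliar with Lucene query syntax, the four search styles named on the slide look roughly like this. The field names (`title`, `body`) and the `q` parameter name are assumptions for illustration. The sketch only builds URL-encoded query strings and does not talk to a Riak node.

```python
from urllib.parse import urlencode

# One Lucene-syntax example per search style named on the slide.
# Field names are hypothetical.
QUERIES = {
    "field": "title:riak",
    "wildcard": "title:ria*",
    "fuzzy": "title:riack~",
    "proximity": 'body:"fault tolerant"~5',
}


def query_string(query: str) -> str:
    """URL-encode a Lucene query as a `q` parameter (parameter name assumed)."""
    return urlencode({"q": query})


for style, query in QUERIES.items():
    print(f"{style:10} {query:28} {query_string(query)}")
```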
  • 15. Questions?