Bathcamp 2010-riak






    Presentation Transcript

    • Timothy Perrett
      Bath Camp 2010
    • What is Riak?
      Document-oriented database
      Written in Erlang
      Based on Dynamo[1] and the CAP theorem[2]
      Highly fault tolerant
      HTTP and Protocol Buffers interfaces
      Write MapReduce in Erlang or JavaScript
      1. http://goo.gl/r8Np
      2. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
    • Same, Same but different
      Riak solves similar problems to MongoDB
      Semi-structured data modeled as “documents”
      Storage of non-document data in the database
      High write-availability
      Riak is intrinsically multi-node scalable
      Mongo, in comparison, is a single-system design (+ sharding)
      Riak achieves availability via quorum writes
      Mongo uses performant in-place writes
      Riak uses “masterless” replication
    • N/R/W – Dynamo
      N = Number of replicas to store
      R = Number of replicas needed to read
      W = Number of replicas needed to write
      These principles first appeared in an Amazon research paper known as Dynamo
    • Number of replicas (N)
      160-bit integer key space. Each node that joins is assigned part of that space for consistent hashing
      Hashing means any node can service any request, making the cluster masterless and eventually consistent
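
A minimal sketch of the key-space idea on this slide, assuming SHA-1 as the 160-bit hash and a simple round-robin split of the ring. The names `key_position`, `build_ring` and `preference_list` are hypothetical helpers for illustration, not Riak's API:

```python
import hashlib
from bisect import bisect_right

RING_SIZE = 2 ** 160  # SHA-1 digests are 160 bits wide


def key_position(bucket: str, key: str) -> int:
    """Map a bucket/key pair to an integer position on the 160-bit ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def build_ring(nodes, partitions=64):
    """Split the ring into equal partitions and hand them out round-robin.

    Returns a list of (partition_start, node_name) tuples sorted by start.
    The partition count of 64 is an assumption made for this sketch.
    """
    step = RING_SIZE // partitions
    return [(i * step, nodes[i % len(nodes)]) for i in range(partitions)]


def preference_list(ring, bucket, key, n=3):
    """Return the owners of the n partitions that follow the key's position."""
    starts = [start for start, _ in ring]
    first = bisect_right(starts, key_position(bucket, key)) % len(ring)
    return [ring[(first + i) % len(ring)][1] for i in range(n)]


ring = build_ring(["n0", "n1", "n2", "n3"])
owners = preference_list(ring, "users", "tim", n=3)
```

Because every node can compute the same positions from the same ring, any node can work out which replicas own a key without asking a master.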
    • Number of replies (R) before Riak gives the client a successful reply
      Tries to access all nodes, but as soon as R is satisfied a response is given
    • Same as reads: W is the number of nodes that must reply successfully before the client considers the write consistent
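
The R and W rules above can be sketched as a toy in-memory simulation. This is an illustration under simplifying assumptions (each replica is a plain dict, an unreachable node is `None`, and the `Unavailable` exception is a hypothetical name), not how Riak is implemented:

```python
class Unavailable(Exception):
    """Raised when fewer replicas answered than the request demanded."""


def quorum_write(replicas, key, value, w):
    """Write to every reachable replica; succeed if at least w acknowledged."""
    acks = 0
    for replica in replicas:
        if replica is None:  # node is down or unreachable
            continue
        replica[key] = value
        acks += 1
    if acks < w:
        raise Unavailable(f"only {acks} of the required {w} replicas acknowledged")
    return acks


def quorum_read(replicas, key, r):
    """Ask the replicas in turn and return as soon as r of them have replied."""
    replies = []
    for replica in replicas:
        if replica is None:  # node is down or unreachable
            continue
        replies.append(replica.get(key))
        if len(replies) == r:
            return replies
    raise Unavailable(f"only {len(replies)} of the required {r} replicas replied")


# N=3 with one node down: R=W=2 still succeeds.
replicas = [{}, None, {}]
quorum_write(replicas, "doc", "v0", w=2)
values = quorum_read(replicas, "doc", r=2)
```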
    • Extreme example
      Given N=10 and R=W=2, we could have 8 nodes down and the cluster would still be fully available to all clients
    • What does this all mean?
      N/R/W specified at request time, so each client can specify its own tolerance for outages dynamically
      Despite any outages within the cluster, the whole cluster can still appear available based on N/R/W
      Given N=3 and R=W=2, we can have 3-2=1 node down/unreachable/laggy in the cluster
      Stupidly high availability complete with eventual consistency controlled by dynamic clients
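
The arithmetic behind these examples can be written down directly. A small sketch, assuming each of the N replicas lives on a distinct node and ignoring any recovery mechanisms; `tolerated_failures` and `is_available` are hypothetical helper names:

```python
def tolerated_failures(n: int, r: int, w: int) -> int:
    """How many of the n replica nodes can be down while reads and writes still succeed."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must both be between 1 and n")
    # Reads need r live replicas and writes need w, so the stricter one decides.
    return n - max(r, w)


def is_available(n: int, r: int, w: int, nodes_down: int) -> bool:
    """True if the cluster can still serve both reads and writes."""
    return nodes_down <= tolerated_failures(n, r, w)


extreme = tolerated_failures(10, 2, 2)  # the N=10, R=W=2 case
typical = tolerated_failures(3, 2, 2)   # the N=3, R=W=2 case
```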
    • Brewer’s CAP Theorem
      Consistency, Availability, Partition Tolerance
      You can't have all things, all the time…
      …but you can have some of each, all the time!
      Riak is about choosing your own levels of each according to your use case
    • Consistency
      Start with document version zero
      Things get redistributed: n0 and n2 are sitting in NYC, and n1 and n3 are in London
      What if stuff changes?
    • Consistency
      Uh oh: inconsistency
      Both parts of the cluster are still fully available
      NYC serves v1 whilst London serves v0
      The network resumes and Riak determines the latest version by using vector clocks
    • Consistency
      What if both sides of the Atlantic changed?
      Riak is unable to determine which is the right document, so both are returned to the client with an indication of the inconsistency
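
The two outcomes in the consistency slides (one side wins, or both versions come back) can be sketched with a textbook vector clock. This is a simplified illustration, with clocks as plain dicts of per-node counters and hypothetical function names, not Riak's internal representation:

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of the clock with the given node's counter bumped by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def resolve(version_a, version_b):
    """Each version is a (clock, value) pair. Return the surviving versions."""
    clock_a, clock_b = version_a[0], version_b[0]
    if descends(clock_a, clock_b):
        return [version_a]          # a is the same as or newer than b
    if descends(clock_b, clock_a):
        return [version_b]          # b is newer than a
    return [version_a, version_b]   # concurrent updates: hand both to the client


v0_clock = increment({}, "n0")
london_v0 = (v0_clock, "v0")
nyc_v1 = (increment(v0_clock, "n0"), "v1 from NYC")
london_v1 = (increment(v0_clock, "n1"), "v1 from London")

only_nyc_changed = resolve(nyc_v1, london_v0)   # NYC's version wins
both_changed = resolve(nyc_v1, london_v1)       # both versions are returned
```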
    • Riak Search
      Distributed, fault-tolerant full-text searching
      Lucene syntax for queries
      No need for index sharding
      Linear scaling
      Double the number of nodes to get double the search capacity (awesome!)
      Search via:
      Fields, wildcards, fuzzy text or token proximity
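
For readers unfamiliar with Lucene query syntax, the four search styles listed on this slide look roughly like the strings built below. The helper names are hypothetical, the field and terms are made up, and the sketch assumes terms contain no characters that need escaping:

```python
def field_query(field: str, term: str) -> str:
    """Match a term within a named field, e.g. title:riak."""
    return f"{field}:{term}"


def wildcard_query(prefix: str) -> str:
    """Match any term starting with the given prefix, e.g. erl*."""
    return f"{prefix}*"


def fuzzy_query(term: str) -> str:
    """Match terms that are spelled similarly, e.g. dynamo~."""
    return f"{term}~"


def proximity_query(words, distance: int) -> str:
    """Match the words when they occur within `distance` positions of each other."""
    return '"' + " ".join(words) + f'"~{distance}'


examples = [
    field_query("title", "riak"),
    wildcard_query("erl"),
    fuzzy_query("dynamo"),
    proximity_query(["fault", "tolerant"], 5),
]
```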
    • Questions?