Timothy Perrett
Bath Camp 2010
What is Riak?
• Documented orientated database
• Written in Erlang
• Based on Dynamo[1] and CAP Theorem[2]
• Highly fault ...
Same, Same but different
• Riak solves similar problems to MongoDB
• Semi-structured data modeled as "documents”
• Storage...
N/R/W – Dynamo
N = Number of replicas to store
R = Number of replicas needed to read
W = Number of replicas needed to read...
• 160bit integer key
space. Each node that
joins is assigned part
of that space for
consistent hashing
• Hashing means any...
• Number of replies
before Riak gives
the client a
successful reply.
• Tries to access all
nodes, but as soon
as the N/R i...
• Same as reads; W
implies the number
of successful nodes
that must reply
before the write
is considered
consistent by
the...
Extreme example
• Given N=10, R=W=2 we
could have 8 nodes
down and the cluster
would still be fully
available to all clien...
What does this all mean?
• N/R/W specified at request time, so each
client can specify its own tolerance for
outages dynam...
Brewer’s CAP Theorem
• Consistency
• Availability
• Partition Tolerance
• You cant have all things, all the time…
• …but y...
Consistency
• Start with document
version zero
• Things get redistributed
and n0 and n2 are
sitting in NYC and n1
and n3 a...
Consistency
• Uh oh: inconsistency
• Both parts of the cluster
are still fully available
• NYC serves v1 whilst
London ser...
Consistency
• What if both sides of
the Atlantic changed?
• Riak is unable to
determine which is the
right document, both
...
• Distributed, fault-tolerant full-text searching
• Lucene syntax for queries
• No need for index sharding
• Linier scalin...
Questions?
basho.com/riak.html
github.com/basho/riak
twitter.com/timperrett
github.com/timperrett
blog.getintheloop.eu
Upcoming SlideShare
Loading in...5
×

Bathcamp 2010-riak

1,155

Published on

Published in: Technology, News & Politics
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,155
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
0
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

Transcript of "Bathcamp 2010-riak"

  1. 1. Timothy Perrett Bath Camp 2010
  2. 2. What is Riak? • Documented orientated database • Written in Erlang • Based on Dynamo[1] and CAP Theorem[2] • Highly fault tolerant • HTTP and ProtoBuff interface • Write MapReduce in Erlang or JavaScript 1. http://goo.gl/r8Np 2. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
  3. 3. Same, Same but different • Riak solves similar problems to MongoDB • Semi-structured data modeled as "documents” • Storage of non-document data in the database • High write-availability • Riak is intrinsically multi-node scalable • Mongo in comparison is single system (+ sharding) • Riak achieves availability via quorum writes • Mongo uses performant in-place writes • Riak uses “masterless” replication
  4. 4. N/R/W – Dynamo N = Number of replicas to store R = Number of replicas needed to read W = Number of replicas needed to read • These principals first appeared in an Amazon research paper known as Dynamo
  5. 5. • 160bit integer key space. Each node that joins is assigned part of that space for consistent hashing • Hashing means any node can service any request making the cluster masterless and eventually consistant Number of replicas
  6. 6. • Number of replies before Riak gives the client a successful reply. • Tries to access all nodes, but as soon as the N/R is satisfied a response is given Reads
  7. 7. • Same as reads; W implies the number of successful nodes that must reply before the write is considered consistent by the client Writes
  8. 8. Extreme example • Given N=10, R=W=2 we could have 8 nodes down and the cluster would still be fully available to all clients
  9. 9. What does this all mean? • N/R/W specified at request time, so each client can specify its own tolerance for outages dynamically • Despite any outages within the cluster, the whole cluster can still appear available based on N/R/W • Given N=3 and R=W=2, we can have 3-2=1 node down/unreachable/laggy in the cluster • Stupidly high availability complete with eventual consistency controlled by dynamic clients
  10. 10. Brewer’s CAP Theorem • Consistency • Availability • Partition Tolerance • You cant have all things, all the time… • …but you can have some of each, all the time! • Riak is about choosing your own levels of each according to your use case
  11. 11. Consistency • Start with document version zero • Things get redistributed and n0 and n2 are sitting in NYC and n1 and n3 are in London • What if stuff changes??
  12. 12. Consistency • Uh oh: inconsistency • Both parts of the cluster are still fully available • NYC serves v1 whilst London serves v0 • The network resumes and Riak determines the latest version by using vector clocks
  13. 13. Consistency • What if both sides of the Atlantic changed? • Riak is unable to determine which is the right document, both are returned to the client with an indication of the inconsistency
  14. 14. • Distributed, fault-tolerant full-text searching • Lucene syntax for queries • No need for index sharding • Linier scaling • Double the number of nodes to get double the search capacity (awesome!) • Search via: • Fields, wildcards, fuzzy text or token proximity Riak Search
  15. 15. Questions? basho.com/riak.html github.com/basho/riak twitter.com/timperrett github.com/timperrett blog.getintheloop.eu

×