basho
Core Concepts
Introduction to Riak
AKQA
24th July 2013
Friday, 26 July 13
WHO AM I?
Joel Jacobson
Technical Evangelist
Basho Technologies
@joeljacobson
Distributed computing is
HARD.
PROBLEMS?
• Concurrency and latency at scale
• Data consistency
• Uptime/failover
• Multi-tenancy
• SLAs
WHAT IS RIAK?
• Key-Value store + extras
• Distributed and horizontally scalable
• Fault-tolerant
• Highly available
• Bui...
INSPIRED BY AMAZON DYNAMO
• White paper released to describe a database system to be
used for their shopping cart
• Master...
RIAK KEY-VALUE STORE
• Simple operations - GET, PUT, DELETE
• Value is opaque, with metadata
• Extras, e.g.
• Secondary In...
HORIZONTALLY SCALABLE
• Near linear scalability
• Query load and data are spread evenly
• Add more nodes and get more:
• o...
FAULT TOLERANT
• All nodes participate equally - no single point of failure (SPOF)
• All data is replicated
• Clusters self...
HIGHLY AVAILABLE
• Any node can serve client requests
• Fallbacks are used when nodes are down
• Always accepts read and w...
QUORUMS - N/R/W
• Tunable down to bucket level
• n_val = 3 by default
• w / r = 2 by default
• w = 1 - Quicker response ti...
CAP THEOREM
• C = Consistency
• A = Availability
• P = Partition Tolerance
• The CAP theorem states that a
distributed shared da...
THE RING
REPLICATION
• Replicated to 3 nodes by default (n_val = 3, which is
configurable)
DISASTER SCENARIO
• Node fails
• Request goes to fallback
• Node comes back
• Handoff - data returned to
recovered node
• N...
ACTIVE ANTI-ENTROPY
• Automatically repair inconsistencies in data
• Active Anti-Entropy was new in 1.3.0 and uses Merkle ...
CONFLICT RESOLUTION
• Network partitions and concurrent actors modifying the
same data cause data divergence
• Riak provid...
VECTOR CLOCKS
• Every node has an ID
• Send last-seen vector clock in every “put” request
• Can be viewed as ‘commit histo...
SIBLING CREATION
[Diagram: two steps, 1) and 2), showing copies of Object v1 on the ring with vector clocks [{a,3}] and [{a,2},{b,1}]]
STORAGE BACKENDS
• Bitcask
• LevelDB
• Memory
• Multi
BITCASK
• A fast, append-only key-value store
• In-memory key lookup table (key_dir), data on disk
• Closed files are immuta...
LEVELDB
• Key-Value storage developed by Google
• Append-only for very large data sets
• Multiple levels of SSTable-like d...
MEMORY
• Data is never persisted to disk
• Typically used for “test” databases
(unit tests, etc.)
• Definable memory limit...
MULTI
• Configure multiple storage engines for different types of data
• Configure the “default” storage engine
• Choose sto...
CLIENT APIS
• Riak supports two main client types:
• REST based HTTP Interface
• Easy to use from command line and simple ...
CLIENT LIBRARIES
• Client libraries supported by Basho:
• Community supported languages and frameworks:
• C/C++, Clojure, ...
• Using Riak as datastore for all back-end systems supporting
Angry Birds
• Game-state storage, ID/Login, Payments, Push n...
• Spine2 project - storing patient data (80 million+)
• 500 complex messages per second
• 20,000 integrated end points
• 0...
• Push to talk application
• Billions of requests daily
• > 50 dedicated servers
• Everything stored in Riak
• https://git...
MULTI DATACENTER
REPLICATION (MDC)
• Allows data to be replicated between clusters in different data
centers. Can handle l...
RIAK-CS
• Built on top of Riak and supports MDC
• S3 compatible object storage
• Supports multi-tenancy
• Per-tenant usage...
PLAY AROUND WITH RIAK?
• https://github.com/joeljacobson/riak-dev-cluster
• https://github.com/joeljacobson/vagrant-riak-c...
THANK YOU
joel@basho.com
basho
Introduction to Riak - Joel Jacobson

In this talk we will look at the core concepts of Riak and why you might want to use it for your application. We will then look at some customer use cases and how Riak helped them scale with ease.

Joel Jacobson is a Technical Evangelist at Basho Technologies where he helps build the Riak community across Europe. Prior to joining Basho, Joel worked closely with Neo Technologies as part of his role at the consultancy OpenCredo.

  1. basho Core Concepts Introduction to Riak AKQA 24th July 2013
  2. WHO AM I? Joel Jacobson Technical Evangelist Basho Technologies @joeljacobson
  3. Distributed computing is HARD.
  4. PROBLEMS? • Concurrency and latency at scale • Data consistency • Uptime/failover • Multi-tenancy • SLAs
  5. WHAT IS RIAK? • Key-Value store + extras • Distributed and horizontally scalable • Fault-tolerant • Highly available • Built for the web
  6. INSPIRED BY AMAZON DYNAMO • White paper released to describe a database system to be used for their shopping cart • Masterless, peer-coordinated replication • Dynamo-inspired data stores: Riak, Cassandra, Voldemort, etc. • Consistent hashing - no sharding :-) • Eventually consistent
  7. RIAK KEY-VALUE STORE • Simple operations - GET, PUT, DELETE • Value is opaque, with metadata • Extras, e.g. • Secondary Indexes (2i) • MapReduce • Full text search
  8. HORIZONTALLY SCALABLE • Near linear scalability • Query load and data are spread evenly • Add more nodes and get more: • ops/second • storage capacity • compute power (for Map/Reduce)
  9. FAULT TOLERANT • All nodes participate equally - no single point of failure (SPOF) • All data is replicated • Clusters self heal - Handoff, Active Anti-Entropy • Cluster transparently survives: • node failure • network partitions • Built on Erlang/OTP (designed for FT)
  10. HIGHLY AVAILABLE • Any node can serve client requests • Fallbacks are used when nodes are down • Always accepts read and write requests • Per-request quorums
  11. QUORUMS - N/R/W • Tunable down to bucket level • n_val = 3 by default • w / r = 2 by default • w = 1 - Quicker response time, reads could be inconsistent in the short term • w = all - Slower response, increased data consistency
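The N/R/W trade-off on the quorums slide can be sketched with a little arithmetic. This is an illustration only: the function name `quorum_properties` and the returned keys are hypothetical, and the overlap rule assumes a strict quorum model, which is simpler than what a real cluster with fallbacks does.

```python
def quorum_properties(n: int, r: int, w: int) -> dict:
    """Summarise what a given N/R/W choice implies (strict-quorum assumption)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return {
        # If r + w > n, every read set shares at least one replica with
        # the most recent successful write set.
        "read_write_overlap": r + w > n,
        # Number of replicas that may be unreachable while requests still succeed.
        "reads_tolerate_failures": n - r,
        "writes_tolerate_failures": n - w,
    }


defaults = quorum_properties(n=3, r=2, w=2)      # the defaults from the slide
fast_writes = quorum_properties(n=3, r=2, w=1)   # quicker writes, weaker overlap
print(defaults)
print(fast_writes)
```

With the defaults, reads and writes overlap on at least one replica; with w = 1 they may not, which is why a read can be briefly stale.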
  12. CAP THEOREM • C = Consistency • A = Availability • P = Partition Tolerance • The CAP theorem states that a distributed shared data system can support at most 2 out of these 3 properties [Diagram: three DB nodes and two clients separated by a network/data partition]
  13. THE RING
  14. REPLICATION • Replicated to 3 nodes by default (n_val = 3, which is configurable)
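How a key lands on the ring and is replicated n_val times can be sketched as follows. This is a simplified model, not Riak's implementation: the SHA-1 hash over "bucket/key", the 64-partition ring, and the names `partition_for` and `preference_list` are assumptions made for the example.

```python
import hashlib

RING_SIZE = 2 ** 160     # size of the SHA-1 output space (assumed hash)
NUM_PARTITIONS = 64      # assumed partition count for this sketch
N_VAL = 3                # replicas per object, as on the slide


def partition_for(bucket: str, key: str) -> int:
    """Map a bucket/key pair to one of the equally sized ring partitions."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode()).digest()
    position = int.from_bytes(digest, "big")
    return position // (RING_SIZE // NUM_PARTITIONS)


def preference_list(bucket: str, key: str, n_val: int = N_VAL) -> list:
    """Return the n_val consecutive partitions that hold the replicas."""
    first = partition_for(bucket, key)
    return [(first + i) % NUM_PARTITIONS for i in range(n_val)]


replicas = preference_list("users", "user_id")
print(replicas)
```

Because the partition is derived from the hash alone, any node can compute where a key lives without consulting a master.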
  15. DISASTER SCENARIO • Node fails • Request goes to fallback • Node comes back • Handoff - data returned to recovered node • Normal operations resume automatically
  16. DISASTER SCENARIO • Node fails • Request goes to fallback • Node comes back • Handoff - data returned to recovered node • Normal operations resume automatically hash(“user_id”)
  17. ACTIVE ANTI-ENTROPY • Automatically repairs inconsistencies in data • Active Anti-Entropy was new in 1.3.0 and uses Merkle trees to compare data in partitions and periodically ensure consistency • Active Anti-Entropy runs as a background process • Can also be configured as a manual process
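The idea of comparing replicas with hash trees can be sketched in a few lines. This is a toy two-level tree, not Riak's AAE code: the segment count, SHA-256, and the names `build_tree` and `keys_to_repair` are assumptions chosen for the illustration.

```python
import hashlib

SEGMENTS = 8  # hypothetical number of leaf segments


def _h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def segment_of(key: str) -> int:
    """Assign each key to a fixed leaf segment."""
    return int(_h(key.encode()), 16) % SEGMENTS


def build_tree(replica: dict) -> dict:
    """Hash each segment's key/value pairs, then hash the leaves into a root."""
    segments = [[] for _ in range(SEGMENTS)]
    for key in sorted(replica):
        segments[segment_of(key)].append(f"{key}={_h(replica[key])}")
    leaves = [_h("|".join(items).encode()) for items in segments]
    return {"root": _h("".join(leaves).encode()), "leaves": leaves}


def keys_to_repair(a: dict, b: dict) -> set:
    """Compare roots first; descend only into segments whose hashes differ."""
    tree_a, tree_b = build_tree(a), build_tree(b)
    if tree_a["root"] == tree_b["root"]:
        return set()  # replicas agree, nothing else is compared
    differing = {i for i in range(SEGMENTS)
                 if tree_a["leaves"][i] != tree_b["leaves"][i]}
    suspects = {k for k in set(a) | set(b) if segment_of(k) in differing}
    return {k for k in suspects if a.get(k) != b.get(k)}


replica_a = {"k1": b"x", "k2": b"y"}
replica_b = {"k1": b"x", "k2": b"stale"}
print(keys_to_repair(replica_a, replica_b))
```

The point of the tree is that matching replicas cost one root comparison, and mismatches narrow the search to a few segments instead of a full scan.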
  18. CONFLICT RESOLUTION • Network partitions and concurrent actors modifying the same data cause data divergence • Riak provides two solutions to manage this, which can be set at bucket level: • Last Write Wins - an approach used for some use cases • Vector Clocks - Retain “sibling” copies of data for merging
  19. VECTOR CLOCKS • Every node has an ID • Send last-seen vector clock in every “put” request • Can be viewed as ‘commit history’, e.g. Git • Lets you decide how to resolve conflicts
  20. SIBLING CREATION [Diagram: two steps, 1) and 2), showing copies of Object v1 on the ring with vector clocks [{a,3}] and [{a,2},{b,1}]] • Siblings can be created by: • Simultaneous writes (based on same object version) • Network partitions • Writes to existing key without submitting vector clock
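Why the two clocks on the sibling slide produce siblings can be shown with a small comparison routine. This is a sketch of the general vector clock rule, with clocks modelled as plain dicts of actor to counter; the function names are hypothetical and this is not Riak's internal representation.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: dict, b: dict) -> str:
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a supersedes b"
    if b_ge_a:
        return "b supersedes a"
    # Neither history contains the other: both values are kept as siblings.
    return "concurrent"


# The clocks from the slide: [{a,3}] and [{a,2},{b,1}]
print(compare({"a": 3}, {"a": 2, "b": 1}))
# A plain update of an older version
print(compare({"a": 3}, {"a": 2}))
```

The first pair is concurrent because one side has a newer count for actor a while the other has an event from actor b that the first never saw.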
  21. STORAGE BACKENDS • Bitcask • LevelDB • Memory • Multi
  22. BITCASK • A fast, append-only key-value store • In-memory key lookup table (key_dir), data on disk • Closed files are immutable • Merging cleans up old data • Developed by Basho Technologies • Suitable for bounded data, e.g. reference data
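The append-only file plus in-memory key_dir design can be sketched as a toy store. This is an illustration of the idea only, assuming a single data file with no checksums, timestamps, or merging; the class name `TinyAppendOnlyStore` is hypothetical and nothing here reflects Bitcask's actual file format.

```python
import os
import tempfile


class TinyAppendOnlyStore:
    """Toy store: values are appended to one file, keys are indexed in memory."""

    def __init__(self, path: str):
        self.path = path
        self.keydir = {}              # key -> (offset, length) of the latest value
        open(path, "ab").close()      # create the data file if it is missing

    def put(self, key: str, value: bytes) -> None:
        offset = os.path.getsize(self.path)   # new data always goes at the end
        with open(self.path, "ab") as f:
            f.write(value)
        self.keydir[key] = (offset, len(value))

    def get(self, key: str) -> bytes:
        offset, length = self.keydir[key]     # one lookup, then one disk seek
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)


store = TinyAppendOnlyStore(os.path.join(tempfile.mkdtemp(), "data.log"))
store.put("user:1", b"v1")
store.put("user:1", b"v2")   # appended; the old bytes stay on disk until a merge
print(store.get("user:1"))
```

Since every key must fit in the in-memory table, this design suits the bounded data sets the slide mentions.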
  23. LEVELDB • Key-Value storage developed by Google • Append-only for very large data sets • Multiple levels of SSTable-like data structures • Allows for more advanced querying (2i) • It includes compression (Snappy algorithm) • Suitable for unbounded data or advanced querying
  24. MEMORY • Data is never persisted to disk • Typically used for “test” databases (unit tests, etc.) • Definable memory limits per vnode • Configurable object expiry • Useful for highly transient data
  25. MULTI • Configure multiple storage engines for different types of data • Configure the “default” storage engine • Choose storage engine on per bucket basis • No reason not to use it
  26. CLIENT APIS • Riak supports two main client types: • REST based HTTP Interface • Easy to use from command line and simple scripts • Useful if using intermediate caching layer, e.g. Varnish • Protocol Buffers • Optimized binary encoding standard developed by Google • More performant than HTTP interface
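A request against the HTTP interface might be built as below. This sketch only constructs the request and does not contact a server. The `/buckets/<bucket>/keys/<key>` path, the `w` query parameter, the `X-Riak-Vclock` header, and port 8098 reflect the Riak 1.x HTTP API as the editor recalls it and should be checked against the documentation for your version; the helper name `riak_put` is hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def riak_put(base_url, bucket, key, body, content_type="application/json",
             vclock=None, w=None):
    """Build (but do not send) a PUT request that stores one object."""
    url = f"{base_url}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    if w is not None:
        url += "?" + urlencode({"w": w})      # per-request write quorum
    headers = {"Content-Type": content_type}
    if vclock is not None:
        headers["X-Riak-Vclock"] = vclock     # last-seen vector clock from a prior GET
    return Request(url, data=body, headers=headers, method="PUT")


req = riak_put("http://127.0.0.1:8098", "users", "joel",
               b'{"role": "evangelist"}', vclock="abc", w=1)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running node; that step is left out so the sketch runs without a cluster.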
  27. CLIENT LIBRARIES • Client libraries supported by Basho: • Community supported languages and frameworks: • C/C++, Clojure, Common Lisp, Dart, Django, Go, Grails, Griffon, Groovy, Erlang, Haskell, Java, .NET, Node.js, OCaml, Perl, PHP, Play, Python, Racket, Ruby, Scala, Smalltalk
  28. • Using Riak as datastore for all back-end systems supporting Angry Birds • Game-state storage, ID/Login, Payments, Push notifications, analytics, advertisements • 9 clusters in use with over 100 nodes • 263 million active monthly users
  29. • Spine2 project - storing patient data (80 million+) • 500 complex messages per second • 20,000 integrated end points • 0 data loss • 99.9% availability SLA
  30. • Push to talk application • Billions of requests daily • > 50 dedicated servers • Everything stored in Riak • https://github.com/mranney/node_riak
  31. MULTI DATACENTER REPLICATION (MDC) • Allows data to be replicated between clusters in different data centers. Can handle larger latencies. • Two synchronization modes that can be used together: real-time and full sync • Set up as uni-directional or bi-directional replication • Can be used for global load-balancing, business continuity and back-ups
  32. RIAK-CS • Built on top of Riak and supports MDC • S3 compatible object storage • Supports multi-tenancy • Per-tenant usage data and statistics on network I/O • Supports objects of arbitrary content type up to 5 TB • Often used to build private cloud storage
  33. PLAY AROUND WITH RIAK? • https://github.com/joeljacobson/riak-dev-cluster • https://github.com/joeljacobson/vagrant-riak-cluster
  34. THANK YOU joel@basho.com basho