Rusty Klophaus (@rklophaus)
BashoTechnologies
Masterless Distributed
Computing with Riak Core
Erlang User Conference
Stock...
BashoTechnologies
2
About BashoTechnologies
• Offices in:
• Cambridge, Massachusetts
• San Francisco, California
• Distribu...
Riak KV and Riak Search
3
Riak KV
Key/Value Datastore
Map/Reduce, Lightweight Data Relations, Client APIs
Riak Search
Full...
Riak
KV
Riak
Search
Riak
Core
The Common Parts are called Riak Core
Monday, November 22, 2010
Riak
KV
Riak
Search
Riak
Core
The Common Parts are called Riak Core
Distribution / Scaling /
Failure-Tolerance Code
Monday...
Riak Core is an
Open Source Erlang library
that helps you build
distributed, scalable, failure-tolerant
applications using...
“We Generalized the
Dynamo Architecture and
Open-Sourced the Bits.”
7
Monday, November 22, 2010
What Areas are Covered?
Amazon’s Dynamo Paper highlighted to
show parts covered in Riak Core.
Monday, November 22, 2010
Distributed, scalable, failure-tolerant.
9
Monday, November 22, 2010
Distributed, scalable, failure-tolerant.
No central coordinator.
Easy to setup/operate.
10
Monday, November 22, 2010
Distributed, scalable, failure-tolerant.
Horizontally scalable;
add commodity hardware
to get more X.
11
Monday, November ...
Distributed, scalable, failure-tolerant.
Always available.
No single point of failure.
Self-healing.
12
Monday, November 2...
Wait, doesn’t *Erlang* let you build
distributed, scalable, failure-tolerant
applications?
13
Monday, November 22, 2010
Client
Service A Service B
Resource D
Service C
Queue E
Erlang makes it easy to connect the
components of your application...
Service
Node A
Node E
Node I
Node M
Node B
Node F
Node J
Node N
Node C
Node G
Node K
Node O
Node D
Node H
Node L
. . .
Ria...
How does Riak Core work?
16
Monday, November 22, 2010
A Simple Interface...
Command ObjectName, Payload
Send commands, get responses.
How do we route the commands
to physical m...
Hash the Object Name
Command ObjectName, Payload
SHA1(ObjName), Payload
0 to 2^160
Monday, November 22, 2010
A Naive Approach
Command ObjectName, Payload
SHA1(ObjName), Payload
Node A Node B Node C Node D
Monday, November 22, 2010
A Naive Approach
Command
SHA1(ObjName), Payload
Node A Node B Node C Node D Node E
ObjectName, Payload
Existing routes bec...
"All problems in computer
science can be solved by
another level of indirection."
- David Wheeler
21
Monday, November 22, ...
Routing with Consistent Hashing
Command ObjectName, Payload
SHA1(ObjName), Payload
VNode 0 VNode 1 VNode 2 VNode 3 VNode 4...
Adding a Node
Command ObjectName, Payload
SHA1(ObjName), Payload
VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6 V...
Removing a Node
Command ObjectName, Payload
SHA1(ObjName), Payload
VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6...
The Ring
Hash Location
Monday, November 22, 2010
Writing Replicas (NValue)
Locations when N=3
Monday, November 22, 2010
Routing Around Failures
Locations when N=3
and node 0 is down.
X
Monday, November 22, 2010
The Preflist
Preflist
Monday, November 22, 2010
Location of the Routing Layer
29
Monday, November 22, 2010
Router in the Middle Leads to SPOF
Client Client Client
Router
VNode
0
Node A Node B Node C Node D Node E
VNode
1
VNode
3
...
Riak Core - Router on Each Node
Client Client Client
Router Router Router RouterRouter
VNode
0
Node A Node B Node C Node D...
Eventually - Router in the Client
Client Client Client
VNode
0
Node A Node B Node C Node D Node E
VNode
1
VNode
3
VNode
4
...
How DoThe Routers Reach Agreement?
Router Router Router RouterRouter
VNode
0
Node A Node B Node C Node D Node E
VNode
1
VN...
The Nodes GossipTheir WorldView
Local
Ring State
Incoming
Ring State
Are rings equivalent?
Strictly descendent?
Or differe...
Not Mentioned
Vector Clocks
MerkleTrees
Bloom Filters
35
Monday, November 22, 2010
Building an Application
with Riak Core
36
Monday, November 22, 2010
Building an Application on Riak Core?
Two things to think about:
37
The Command Set
Command = ObjectName, Payload
The comm...
Writing aVNode Module
38
Startup/Shutdown
init([Partition]) ->
{ok, State}
terminate(State) ->
ok
Receive Incoming Command...
Writing aVNode Module
39
Send and Receive Handoff Data
handoff_starting(Node, State) ->
{Bool, State1}
encode_handoff_data...
Start the riak_core application
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_c...
Start the riak_core application
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_c...
Start the riak_core application
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_c...
Start the riak_core application
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_c...
Start the riak_core application
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_c...
Start the riak_core application
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_c...
In your application...
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_core_gossi...
In your application...
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_core_gossi...
In your application...
riak_core
riak_core_vnode_sup riak_core_handoff_*
riak_core_ring_*
riak_core_node_*
riak_core_gossi...
Start Sending Commands
49
# Figure out the preflist...
{_Verb, ObjName, _Payload} = Command,
PrefList = riak_core_apl:get_...
Review
Riak KV
Open Source Key/Value datastore.
Riak Search
Full-text, near real-time search engine based on Riak Core.
Ri...
Thanks! Questions?
Learn More
http://wiki.basho.com
Read Amazon’s Dynamo Paper
Get the Code
http://github.com/basho/riak_c...
END
Monday, November 22, 2010
Upcoming SlideShare
Loading in...5
×

Masterless Distributed Computing with Riak Core - EUC 2010

5,625

Published on

Riak Core--an open-source Erlang library created by Basho Technologies that powers Riak KV and Riak Search--allows developers to build distributed, scalable, failure-tolerant applications based on a generalized version of Amazon's Dynamo architecture. In this talk, Rusty will explain why Riak Core was built, discuss what problems it solves and how it works, and walk through the steps to using Riak Core in an Erlang application.

Published in: Technology
0 Comments
16 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
5,625
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
151
Comments
0
Likes
16
Embeds 0
No embeds

No notes for slide

Masterless Distributed Computing with Riak Core - EUC 2010

  1. 1. Rusty Klophaus (@rklophaus) BashoTechnologies Masterless Distributed Computing with Riak Core Erlang User Conference Stockholm, Sweden · November 2010 Monday, November 22, 2010
  2. 2. BashoTechnologies 2 About BashoTechnologies • Offices in: • Cambridge, Massachusetts • San Francisco, California • Distributed company • ~20 people • Riak KV and Riak Search (both Open Source) • SLA-based Support and Enterprise Software ($) • Other Open Source Erlang developed by Basho: • Webmachine, Rebar, Bitcask, Erlang_JS, Basho Bench Monday, November 22, 2010
  3. 3. Riak KV and Riak Search 3 Riak KV Key/Value Datastore Map/Reduce, Lightweight Data Relations, Client APIs Riak Search Full-text search and indexing engine Near Realtime Indexing, Riak KV Integration, Solr Support. Common Properties Both are distributed, scalable, failure-tolerant applications. Both based on Amazon’s Dynamo Architecture. Monday, November 22, 2010
  4. 4. Riak KV Riak Search Riak Core The Common Parts are called Riak Core Monday, November 22, 2010
  5. 5. Riak KV Riak Search Riak Core The Common Parts are called Riak Core Distribution / Scaling / Failure-Tolerance Code Monday, November 22, 2010
  6. 6. Riak Core is an Open Source Erlang library that helps you build distributed, scalable, failure-tolerant applications using a Dynamo-style architecture. 6 Monday, November 22, 2010
  7. 7. “We Generalized the Dynamo Architecture and Open-Sourced the Bits.” 7 Monday, November 22, 2010
  8. 8. What Areas are Covered? Amazon’s Dynamo Paper highlighted to show parts covered in Riak Core. Monday, November 22, 2010
  9. 9. Distributed, scalable, failure-tolerant. 9 Monday, November 22, 2010
  10. 10. Distributed, scalable, failure-tolerant. No central coordinator. Easy to setup/operate. 10 Monday, November 22, 2010
  11. 11. Distributed, scalable, failure-tolerant. Horizontally scalable; add commodity hardware to get more X. 11 Monday, November 22, 2010
  12. 12. Distributed, scalable, failure-tolerant. Always available. No single point of failure. Self-healing. 12 Monday, November 22, 2010
  13. 13. Wait, doesn’t *Erlang* let you build distributed, scalable, failure-tolerant applications? 13 Monday, November 22, 2010
  14. 14. Client Service A Service B Resource D Service C Queue E Erlang makes it easy to connect the components of your application. Monday, November 22, 2010
  15. 15. Service Node A Node E Node I Node M Node B Node F Node J Node N Node C Node G Node K Node O Node D Node H Node L . . . Riak Core helps you build a service that harnesses the power of many nodes. Monday, November 22, 2010
  16. 16. How does Riak Core work? 16 Monday, November 22, 2010
  17. 17. A Simple Interface... Command ObjectName, Payload Send commands, get responses. How do we route the commands to physical machines? Monday, November 22, 2010
  18. 18. Hash the Object Name Command ObjectName, Payload SHA1(ObjName), Payload 0 to 2^160 Monday, November 22, 2010
  19. 19. A Naive Approach Command ObjectName, Payload SHA1(ObjName), Payload Node A Node B Node C Node D Monday, November 22, 2010
  20. 20. A Naive Approach Command SHA1(ObjName), Payload Node A Node B Node C Node D Node E ObjectName, Payload Existing routes become invalid when you add/remove nodes. Monday, November 22, 2010
  21. 21. "All problems in computer science can be solved by another level of indirection." - David Wheeler 21 Monday, November 22, 2010
  22. 22. Routing with Consistent Hashing Command ObjectName, Payload SHA1(ObjName), Payload VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6 VNode 7 Node A Node B Node C Node D Monday, November 22, 2010
  23. 23. Adding a Node Command ObjectName, Payload SHA1(ObjName), Payload VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6 VNode 7 Node A Node B Node C Node D Node E Monday, November 22, 2010
  24. 24. Removing a Node Command ObjectName, Payload SHA1(ObjName), Payload VNode 0 VNode 1 VNode 2 VNode 3 VNode 4 VNode 5 VNode 6 VNode 7 Node A Node B Node C Node D Node E Monday, November 22, 2010
  25. 25. The Ring Hash Location Monday, November 22, 2010
  26. 26. Writing Replicas (NValue) Locations when N=3 Monday, November 22, 2010
  27. 27. Routing Around Failures Locations when N=3 and node 0 is down. X Monday, November 22, 2010
  28. 28. The Preflist Preflist Monday, November 22, 2010
  29. 29. Location of the Routing Layer 29 Monday, November 22, 2010
  30. 30. Router in the Middle Leads to SPOF Client Client Client Router VNode 0 Node A Node B Node C Node D Node E VNode 1 VNode 3 VNode 4 VNode 2 VNode 5 VNode 6 VNode 7 Monday, November 22, 2010
  31. 31. Riak Core - Router on Each Node Client Client Client Router Router Router RouterRouter VNode 0 Node A Node B Node C Node D Node E VNode 1 VNode 3 VNode 4 VNode 2 VNode 5 VNode 6 VNode 7 Monday, November 22, 2010
  32. 32. Eventually - Router in the Client Client Client Client VNode 0 Node A Node B Node C Node D Node E VNode 1 VNode 3 VNode 4 VNode 2 VNode 5 VNode 6 VNode 7 Router RouterRouter Why isn’t this done yet? Time and complexity. Monday, November 22, 2010
  33. 33. How DoThe Routers Reach Agreement? Router Router Router RouterRouter VNode 0 Node A Node B Node C Node D Node E VNode 1 VNode 3 VNode 4 VNode 2 VNode 5 VNode 6 VNode 7 Monday, November 22, 2010
  34. 34. The Nodes GossipTheir WorldView Local Ring State Incoming Ring State Are rings equivalent? Strictly descendent? Or different? Monday, November 22, 2010
  35. 35. Not Mentioned Vector Clocks MerkleTrees Bloom Filters 35 Monday, November 22, 2010
  36. 36. Building an Application with Riak Core 36 Monday, November 22, 2010
  37. 37. Building an Application on Riak Core? Two things to think about: 37 The Command Set Command = ObjectName, Payload The commands/requests/operations that you will send through the system. TheVNode Module The callback module that will receive the commands. Monday, November 22, 2010
  38. 38. Writing aVNode Module 38 Startup/Shutdown init([Partition]) -> {ok, State} terminate(State) -> ok Receive Incoming Commands handle_command(Cmd, Sender, State) -> {noreply, State1} | {reply, Reply, State1} handle_handoff_command(Cmd, Sender, State) -> {noreply, State1} | {reply, ok, State1} Monday, November 22, 2010
  39. 39. Writing aVNode Module 39 Send and Receive Handoff Data handoff_starting(Node, State) -> {Bool, State1} encode_handoff_data(Data, State) -> <<Binary>>. handle_handoff_data(Data, Sender, State) -> {reply, ok, State1} handoff_finished(Node, State) -> {ok, State1} Monday, November 22, 2010
  40. 40. Start the riak_core application riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 40 application:start(riak_core). Monday, November 22, 2010
  41. 41. Start the riak_core application riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 41 Supervise vnode processes. Monday, November 22, 2010
  42. 42. Start the riak_core application riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 42 Start, coordinate, and supervise handoff. Monday, November 22, 2010
  43. 43. Start the riak_core application riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 43 Maintain cluster membership information. Monday, November 22, 2010
  44. 44. Start the riak_core application riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 44 Monitor node liveness, broadcast to registered modules. Monday, November 22, 2010
  45. 45. Start the riak_core application riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 45 Send ring information to other nodes. Reconcile different views of the cluster. Rebalance cluster when nodes join or leave. Monday, November 22, 2010
  46. 46. In your application... riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 46 Start the vnodes for your application. Master = { riak_X_vnode_master, { riak_core_vnode_master, start_link, [riak_X_vnode] }, permanent, 5000, worker, [riak_core_vnode_master] }, {ok, { {one_for_one, 5, 10}, [Master]} }. Monday, November 22, 2010
  47. 47. In your application... riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 47 Tell riak_core that your application is ready to receive requests. riak_core:register_vnode_module(riak_X_vnode), riak_core_node_watcher:service_up(riak_X, self()) Monday, November 22, 2010
  48. 48. In your application... riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode riak_core riak_core_vnode_sup riak_core_handoff_* riak_core_ring_* riak_core_node_* riak_core_gossip_* X_vnode . . . X_vnode X_vnode 48 Join to an existing node in the cluster. riak_core_gossip:send_ring(ClusterNode, node()) Monday, November 22, 2010
  49. 49. Start Sending Commands 49 # Figure out the preflist... {_Verb, ObjName, _Payload} = Command, PrefList = riak_core_apl:get_apl(ObjName, NVal, riak_X), # Send the command... riak_core_vnode_master:command(PrefList, Command, riak_X_vnode_master) Monday, November 22, 2010
  50. 50. Review Riak KV Open Source Key/Value datastore. Riak Search Full-text, near real-time search engine based on Riak Core. Riak Core Open Source Erlang library that helps you build distributed, scalable, failure-tolerant applications using a Dynamo-style architecture. 50 Monday, November 22, 2010
  51. 51. Thanks! Questions? Learn More http://wiki.basho.com Read Amazon’s Dynamo Paper Get the Code http://github.com/basho/riak_core Get inTouch rusty@basho.com on Email @rklophaus onTwitter 51 Monday, November 22, 2010
  52. 52. END Monday, November 22, 2010
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×