Process Design and Polymorphism:
Lessons Learnt from Development of Kai
takemaru
08.11.20
  1. Process Design and Polymorphism: Lessons Learnt from Development of Kai (takemaru, 08.11.20)
  2. Outline
      - Reviewing Dynamo and Kai
          - Kai: an Open Source Implementation of Amazon’s Dynamo
          - Features and Mechanism
      - Process Design for Better Performance
          - Based on Calling Sequence and Process State
      - Polymorphism in Actor Model
          - Two approaches to implement polymorphism
  3. Dynamo: Features
      - Key-value store
          - Distributed hash table
      - High scalability
          - No master, peer-to-peer
          - Large-scale cluster, maybe O(1K) nodes
      - Fault tolerant
          - Even if an entire data center fails
          - Still meets latency requirements in that case
      [Figure: get(“cart/joe”) against a cluster that holds three copies of joe]
  4. Dynamo: Partitioning
      - Consistent hashing
          - Nodes and keys are positioned at their hash values
              - MD5 (128 bits)
          - Each key is stored on the N nodes that follow its position on the ring
      [Figure: hash ring (N=3) running from 0 to 2^128, with hash(node1) through hash(node6) and hash(cart/joe) marked; joe is stored on three nodes]
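The placement rule on this slide can be sketched in a few lines of Erlang. This is a minimal illustration, not Kai's actual hashing code: the module name `ring_sketch` and both function names are hypothetical, each node gets exactly one position on the ring, and node names and keys are hashed through `term_to_binary/1` purely for convenience.

```erlang
-module(ring_sketch).
-export([position/1, preference_list/3]).

%% Position of a node name or key on the ring: its MD5 digest
%% read as a 128-bit unsigned integer (assumption: one position per node).
position(Term) ->
    <<Pos:128>> = erlang:md5(term_to_binary(Term)),
    Pos.

%% Up to N nodes that follow Key clockwise on the ring,
%% wrapping around from the top of the ring back to 0.
preference_list(Key, Nodes, N) ->
    KeyPos = position(Key),
    Ring = lists:sort([{position(Node), Node} || Node <- Nodes]),
    {Before, After} =
        lists:partition(fun({Pos, _}) -> Pos < KeyPos end, Ring),
    [Node || {_, Node} <- lists:sublist(After ++ Before, N)].
```

Calling `ring_sketch:preference_list("cart/joe", [node1, node2, node3, node4, node5, node6], 3)` returns the three nodes that would hold the key for N=3. The result depends only on the hash positions, not on the order in which the nodes are listed.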
  5. Dynamo: Membership
      - Gossip-based protocol
          - Spreads membership like a rumor
              - Membership contains the node list and change history
          - Exchanges membership with a randomly chosen node every second
          - Updates membership if a more recent one is received
      - Advantages
          - Robust; no one can prevent a rumor from spreading
          - Exponentially rapid spread
      [Figure: a membership change is spread in the form of gossip]
      [Figure: detecting node failure; "node2 is down" is passed from node to node]
  6. Dynamo: get/put Operations
      - Client
          1. Sends a request to any Dynamo node
              - The request is forwarded to the coordinator
              - Coordinator: one of the nodes associated with the key
      - Coordinator
          1. Chooses N nodes by using consistent hashing
          2. Forwards the request to the N nodes
          3. Waits for responses from R (get) or W (put) nodes, or times out
          4. Checks replica versions if the operation is a get
          5. Sends a response to the client
      [Figure: get/put operations for N,R,W = 3,2,2; requests and responses between the client, the coordinator and three copies of joe]
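Steps 2 and 3 of the coordinator (forward to N nodes, then wait for R or W responses or a timeout) can be sketched as below. This is only an illustration under stated assumptions: `quorum_sketch` and `gather/3` are hypothetical names, and each replica is modelled as a zero-argument fun standing in for a real RPC to another node.

```erlang
-module(quorum_sketch).
-export([gather/3]).

%% Ask every replica in its own process, then wait until Quorum replies
%% have arrived or Timeout milliseconds have passed in total.
%% Each element of Replicas is a fun() -> Reply (a stand-in for an RPC).
gather(Replicas, Quorum, Timeout) ->
    Parent = self(),
    Ref = make_ref(),
    lists:foreach(
      fun(Replica) -> spawn(fun() -> Parent ! {Ref, Replica()} end) end,
      Replicas),
    Deadline = erlang:monotonic_time(millisecond) + Timeout,
    wait(Ref, Quorum, Deadline, []).

%% Enough replies collected: return them in arrival order.
wait(_Ref, 0, _Deadline, Acc) ->
    {ok, lists:reverse(Acc)};
wait(Ref, Needed, Deadline, Acc) ->
    Left = max(0, Deadline - erlang:monotonic_time(millisecond)),
    receive
        {Ref, Reply} -> wait(Ref, Needed - 1, Deadline, [Reply | Acc])
    after Left ->
        {error, timeout}
    end.
```

With N,R,W = 3,2,2, a get would call `gather/3` with three replica funs and a quorum of 2. Replies that arrive after the quorum is met or after the timeout are left in the caller's mailbox, which a real implementation would need to clean up.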
  7. Kai: Overview
      - Kai
          - Open source implementation of Amazon’s Dynamo
              - Named after my origin
              - The name OpenDynamo had already been taken by a project not related to Amazon’s Dynamo
          - Written in Erlang
          - memcache API
          - Found at http://sourceforge.net/projects/kai/
  8. Process Design for Better Performance
  9. Two approaches to improve software performance
      - Do the work with fewer CPU resources
          - Solved by introducing better algorithms
          - e.g. binary search is used instead of linear search
      - Use all CPU resources
          - An issue that shows up in multi-core environments
          - Solved by rearranging the process-to-core mapping
          - e.g. a heavy process and a light process run on a multi-core machine
      [Figure: heavy process and light process on the 1st and 2nd cores; one core is idle ("boring…")]
  10. Two approaches to improve software performance
      - Same two approaches as on the previous slide
      [Figure: heavy process and light process on the 1st and 2nd cores; both cores are busy ("work! work!")]
      - The latter approach, using all CPU resources, is the one discussed here
  11. Diagram Convention
      - Procedural programming
          - Procedures are called by a process
      [Figure: process foo calls module bar. Legend: ellipse = process, rectangle = not a process]

          -module(foo).
          -behaviour(gen_server).
          ok(State) -> {reply, bar:ok(), State}.

          -module(bar).
          ok() -> ok.

  12. Diagram Convention, cont’d
      - Actor model
          - 2 processes interact with each other
      [Figure: process foo calls process bar. Legend: ellipse = process, rectangle = not a process]

          -module(foo).
          -behaviour(gen_server).
          ok(State) -> {reply, bar:ok(), State}.

          -module(bar).
          -behaviour(gen_server).
          ok(State) -> {reply, ok, State}.
          ok() -> gen_server:call(?MODULE, ok).

  13. Design Rules in Erlang
      - Some of the rules on process design:
          - “Assign exactly one parallel process to each true concurrent activity in the system”
          - “Each process should only have one role”
          - from Program Development Using Erlang - Programming Rules and Conventions
  14. Processes in getting/putting data
      [Figure: a client and two nodes (Node 1, Node 2) with memcache, coordinator, rpc and store (data) processes. The coordinator chooses a node that is responsible for the requested key. For clarity, details of the architecture are omitted.]
  15. Sequence in getting/putting data
      [Sequence diagram across Node 1 and Node 2 with lifelines Client, memcache, coordinator, rpc, coordinator, store; annotated "Blocked…"]
  16. Sequence in getting/putting data, cont’d
      [Same diagram: another client arrives while the first request is still blocked]
  17. Sequence in getting/putting data, cont’d
      [Same diagram: the other client's request is accepted and then blocked as well ("Accepted and… Blocked!")]
  18. Design Rules in Erlang, again
      - Another rule on process design:
          - “Use many processes”
          - With many processes, work is spread almost uniformly across the processors by statistical effect
          - from Chapter 20 of Programming Erlang
  19. Processes in getting/putting data, again
      [Figure: the same client and two nodes, now with several memcache, rpc and coordinator processes; multiple coordinator processes are spawned. For clarity, details of the architecture are omitted.]
  20. Sequence in getting/putting data, again
      [Sequence diagram across Node 1 and Node 2 with lifelines Client, memcache, coordinator, rpc, coordinator, store: another client's request is accepted ("Accepted and…")]
  21. Sequence in getting/putting data, again
      [Same diagram: the other client's request is accepted and a coordinator is spawned for it ("Accepted and… Spawned")]
      - It’s OK, but it looks like overkill
      - Concurrency is already produced by the memcache module. Why is it reproduced by the coordinator?
  22. Design Rules in Erlang, in final
      - Another rule on process design:
          - “Don’t spawn stateless processes”
          - Stateless modules are called as procedures from concurrent processes
          - Introduced by me
  23. Processes in getting/putting data, in final
      [Figure: the same client and two nodes with memcache, rpc and store (data) processes; the coordinators are not spawned. For clarity, details of the architecture are omitted.]
  24. Sequence in getting/putting data, in final
      [Sequence diagram across Node 1 and Node 2 with lifelines Client, memcache/coordinator, rpc, coordinator, store: another client's request is accepted and processed]
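The "don't spawn stateless processes" rule can be illustrated with a small sketch. It is not Kai's `kai_coordinator`: the module `coordinator_sketch`, the routing by `erlang:phash2/2` and the node names are all assumptions made for the example. The point is only that `route/2` has no process of its own and runs inside whichever process calls it, so concurrency comes from the callers.

```erlang
-module(coordinator_sketch).
-export([route/2, demo/0]).

%% Stateless: no process and no receive loop, just a function that the
%% calling process executes itself (hypothetical routing by phash2).
route(Key, Nodes) ->
    lists:nth(erlang:phash2(Key, length(Nodes)) + 1, Nodes).

%% Each spawned process stands in for one memcache connection handler.
%% All of them call route/2 at the same time without queueing behind
%% a single coordinator process.
demo() ->
    Parent = self(),
    Nodes = [node1, node2, node3],
    Keys = ["cart/joe", "cart/ann", "cart/bob"],
    lists:foreach(
      fun(Key) -> spawn(fun() -> Parent ! {Key, route(Key, Nodes)} end) end,
      Keys),
    [receive {Key, Node} -> {Key, Node} end || Key <- Keys].
```

`coordinator_sketch:demo()` returns one `{Key, Node}` pair per key, each computed in a different process.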
  25. Lessons Learnt
      - Process design based on calling sequence and process state
          - Externally called module
              - Must be spawned to produce concurrency
              - Runs as multiple processes if needed
              - e.g. TCP listening process, timer process
          - Stateless module
              - No need to be spawned
              - e.g. coordinator of Kai
          - Stateful module
              - Should be spawned for state consistency
              - Runs as multiple processes if possible
              - If it is a single process, it must be a terminal one (never calls other processes synchronously) to avoid blocking
              - e.g. database process, socket pool
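A stateful module that follows the last rule might look like the sketch below. It is a hypothetical in-memory store, not Kai's `kai_store`; the names `store_sketch`, `write/2` and `read/1` are made up for the example. The process owns its state, and its callbacks never call another process synchronously, which is what makes it a terminal process.

```erlang
-module(store_sketch).
-behaviour(gen_server).
-export([start_link/0, write/2, read/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Client-side API: these run in the caller's process.
write(Key, Value) -> gen_server:call(?MODULE, {write, Key, Value}).
read(Key) -> gen_server:call(?MODULE, {read, Key}).

%% The state is a map that only this process can touch.
init([]) -> {ok, #{}}.

%% Terminal process: the callbacks only use their own state and make no
%% synchronous call to any other process, so they cannot block on one.
handle_call({write, Key, Value}, _From, Data) ->
    {reply, ok, Data#{Key => Value}};
handle_call({read, Key}, _From, Data) ->
    {reply, maps:find(Key, Data), Data}.

handle_cast(_Msg, Data) -> {noreply, Data}.
```

After `store_sketch:start_link()`, `write("cart/joe", [book])` returns `ok` and `read("cart/joe")` returns `{ok, [book]}`, while a read of an unknown key returns `error`.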
  26. Process Relationship in Kai
      [Figure: modules memcache, rpc, coordinator, store, hash, connection, version, vclock and config; state labels data, members, buckets, sockets, current version and configs. Legend: ellipse = process, rectangle = not a process. Relationships between stateful processes and other modules are omitted.]
  27. Process Relationship in Kai, cont’d
      [Figure: modules membership and sync (with timer), rpc, store, hash, connection, version, vclock and config; state labels data, members, buckets, sockets, current version and configs. Legend: ellipse = process, rectangle = not a process. Relationships between stateful processes and other modules are omitted.]
  28. Process Relationship in Kai, cont’d
      - “Lessons learnt” are almost satisfied
          - Externally called modules are spawned
              - As multiple processes if needed
              - e.g. kai_rpc, kai_memcache, kai_sync, kai_membership
          - Stateless modules are not spawned
              - e.g. kai_coordinator
          - Stateful modules are spawned
              - e.g. kai_config, kai_version, kai_store, kai_hash, kai_connection
              - However, some of them are NOT terminal ones
              - e.g. the connection module, which calls the config process, is a potential bottleneck
      - “Lessons learnt” can point out potential bottlenecks
          - Yes, the connection module is exactly such a case!
  29. Advanced Issues
      - More concurrency may be introduced if needed
          - Design rules from “Lessons Learnt” can be applied locally
          - e.g. coordinator produces N concurrency for asynchronous calls
      - Compared with Web application servers in the MVC model:

          Web application servers                              | Process design from “Lessons Learnt”
          Concurrency is produced by Web servers, e.g. Apache  | Externally called modules
          Application is controlled by Controller of MVC       | Stateless modules
          State is managed by Model of MVC                     | Stateful modules
  30. Advanced Issues, cont’d
      - Does blocking never occur in pure functional programming?
          - In Kai, a process that receives a request waits for the data to be replied
          - In pure functional programming, does another process handle the data to be replied?
          - Not straightforward for me…
      [Figure: in Kai, the receiving process blocks between Req. and Res.; in a pure functional programming model (?), "Req. and socket" and "Res. and socket" are passed along]
  31. Polymorphism in Actor Model
  32. What’s Polymorphism?
      - “a programming language feature that allows values of different data types to be handled using a uniform interface”
          - from Wikipedia
  33. Polymorphism in Java
      - Interface (including no implementation)
      - Implementation class

          interface Animal {
              void bark();
          }

          class Dog implements Animal {
              public void bark() {
                  System.out.println("Bow wow");
              }
          }

  34. Polymorphism in Java, cont’d
      - Abstract class (including some implementation)
      - Concrete class

          abstract class Animal {
              public void bark() {
                  System.out.println(this.yap());
              }
              abstract String yap();
          }

          class Dog extends Animal {
              String yap() {
                  return "Bow wow";
              }
          }

  35. Polymorphism Inspired by Interface
      - Interface module
          - Initializes the implementation module with a name
          - Calls the process by that name
      - Implementation module
          - Spawns a process and registers it under the given name
          - Implements the actual logic, which is called from the interface
  36. Polymorphism Inspired by Interface, cont’d

          -module(animal).
          start_link(Mod) -> Mod:start_link(?MODULE).
          bark() -> gen_server:call(?MODULE, bark).

          -module(dog).
          -behaviour(gen_server).
          start_link(ServerName) ->
              gen_server:start_link({local, ServerName}, ?MODULE, [], []).
          handle_call(bark, _From, State) -> bark(State).
          bark(State) ->
              io:format("Bow wow~n"),
              {reply, ok, State}.

      Some required callbacks are omitted.
  37. Polymorphism Inspired by Interface, cont’d
      - How to use

          animal:start_link(dog),
          animal:bark().  % Bow wow

  38. Polymorphism Inspired by Interface, cont’d
      - Example in Kai
          - Provides two types of local storage behind a single interface
          - Interface module
              - kai_store
          - Implementation modules
              - kai_store_ets, uses ets, memory storage
              - kai_store_dets, uses dets, disk storage
          - See the actual code for details
  39. Polymorphism Inspired by Abstract Class
      - Abstract module
          - Defines abstract functions by using the behaviour mechanism
          - Spawns a process and stores the name of the concrete module
          - Implements base logic
          - Calls the process
      - Concrete module
          - Implements callbacks, which are called from the abstract module
  40. Polymorphism Inspired by Abstract Class, cont’d

          -module(animal).
          -behaviour(gen_server).
          behaviour_info(callbacks) -> [{yap, 0}];  % abstract String yap();
          behaviour_info(_Other) -> undefined.
          start_link(Mod) ->
              gen_server:start_link({local, ?MODULE}, ?MODULE, [Mod], []).
          init(_Args = [Mod]) -> {ok, _State = {Mod}}.
          %% Return the unchanged state tuple so that the next call still matches {Mod}.
          bark(State = {Mod}) ->
              io:format("~s~n", [Mod:yap()]),
              {reply, ok, State}.
          handle_call(bark, _From, State) -> bark(State).
          bark() -> gen_server:call(?MODULE, bark).

      Some required callbacks are omitted.
  41. Polymorphism Inspired by Abstract Class, cont’d
      - How to use
          - Same as the interface example

          -module(dog).
          -behaviour(animal).
          yap() -> "Bow wow".

          animal:start_link(dog),
          animal:bark().  % Bow wow

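For readers who want to compile the abstract-class pattern, here is a filled-in sketch with the omitted callbacks added. It departs from the slides in three ways, all of them assumptions of this example: the modules are renamed `animal_sketch` and `dog_sketch`, the callback is declared with the `-callback` attribute (available in newer Erlang/OTP releases) instead of `behaviour_info/1`, and `bark/0` returns the string rather than printing it so that the result is easy to check. Erlang needs one module per file, so the block shows two files.

```erlang
%% ---- file: animal_sketch.erl (abstract module) ----
-module(animal_sketch).
-behaviour(gen_server).
-export([start_link/1, bark/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% The "abstract method" that every concrete module must export.
-callback yap() -> string().

%% Spawn the process and remember the name of the concrete module.
start_link(Mod) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [Mod], []).

bark() ->
    gen_server:call(?MODULE, bark).

init([Mod]) ->
    {ok, Mod}.

%% Base logic lives here once; only yap/0 is delegated to Mod.
handle_call(bark, _From, Mod) ->
    {reply, Mod:yap(), Mod}.

handle_cast(_Msg, Mod) ->
    {noreply, Mod}.

%% ---- file: dog_sketch.erl (concrete module) ----
-module(dog_sketch).
-behaviour(animal_sketch).
-export([yap/0]).

yap() -> "Bow wow".
```

Usage mirrors the slide: `animal_sketch:start_link(dog_sketch)` followed by `animal_sketch:bark()`. Adding another animal only requires a new module that exports `yap/0`.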
  42. Polymorphism Inspired by Abstract Class, cont’d
      - Example in Kai
          - Provides two types of TCP listening processes behind a single interface
          - Abstract module (behaviour)
              - kai_tcp_server
          - Concrete modules
              - kai_rpc, listens for RPC calls from other Kai nodes
              - kai_memcache, listens for requests from memcache clients
          - See the actual code for details
  43. Lessons Learnt
      - Two approaches to implement polymorphism
          - Inspired by interface
              - Simple
              - Not efficient
                  - Actual logic has to be implemented in each child
          - Inspired by abstract class
              - Erlang-way
              - Efficient
                  - The abstract class can be shared by children
  44. Summary
  45. Outline
      - Reviewing Dynamo and Kai
          - Kai: an Open Source Implementation of Amazon’s Dynamo
          - Features and Mechanism
      - Process Design for Better Performance
          - Process Design Based on Calling Sequence and Process State
      - Polymorphism in Actor Model
          - Two approaches to implement polymorphism
  46. Kai: Roadmap
      1. Initial implementation (May, 2008)
          - 1,000 lines
      2. Current status
          - 2,200 lines
          - The following tickets are done:

            Module                 | Task
            kai_coordinator        | Requests from clients will be routed to coordinators
            kai_version            | Vector clocks
            kai_store              | Persistent storage
            kai_rpc, kai_memcache  | Process pool
            kai_connection         | Connection pool
