Kai – An Open Source Implementation of Amazon’s Dynamo (in Japanese)
© All Rights Reserved

Presentation Transcript

  • Kai – An Open Source Implementation of Amazon’s Dynamo takemaru 08.6.17
  • Outline. Amazon’s Dynamo: Motivation, Features, Algorithms. Kai: Build and Run, Internals, Roadmap.
  • Dynamo: Motivation. 75K query/sec = 500 req/sec * 150 query/req; O(10M); RDBMS?; Amazon: Dynamo, SimpleDB, S3; Dynamo.
  • Dynamo: Features. Key, value; P2P; O(1K). [Figure labels: get(“cart/joe”), joe]
  • Dynamo: Features, cont’d. Service Level Agreements: 99.9% within 300ms; 15ms, 30ms. Eventually Consistent: RDBMS vs. Dynamo. [Figure labels: get(“cart/joe”), joe]
  • Dynamo: Overview. Dynamo (instance); N. Dynamo APIs: get(key); put(key, value, context); memcache cas context; delete; memcache. [Figure: Dynamo of N = 3; labels joe, bob]
  • Dynamo: Partitioning. Consistent Hashing; 2^128; MD5 (128bits); N. [Figure: Hash ring (N=3) running 0, 1, 2, ... 2^128, with hash(node1) to hash(node6) and hash(cart/joe) for joe]
  • Dynamo: Partitioning, cont’d. N. [Figure: hash ring running 0, 1, 2, ... 2^128, with hash(node1) to hash(node6) and hash(cart/joe) for joe; remapped range when node2 is down]
  • Dynamo: Partitioning. [Figure: hash ring running 0, 1, 2, ... 2^128, with hash(node1) to hash(node6) placed in data center A and data center B, and hash(cart/joe) for joe; joe is replicated in multiple data centers]
  • Dynamo: Partitioning, cont’d. O(100). [Figure: hash ring running 0, 1, 2, ... 2^128, with hash(node1) to hash(node6) plus virtual nodes hash(node3-1), hash(node1-1), hash(node5-1), hash(node2-1), hash(node6-1), hash(node5-1); two virtual nodes per physical node]
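The partitioning slides above can be illustrated with a small sketch. This is not Kai's kai_hash code; it is a minimal Python illustration assuming MD5 as the 128-bit ring hash (as the slides state) and a hypothetical virtual-node labelling scheme ("node1-0", "node1-1"). The class name `HashRing` and its method are chosen for this example only.

```python
import bisect
import hashlib


def md5_int(text: str) -> int:
    """Map a string to a point on the 2^128 ring using MD5."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, virtual_nodes=2):
        # Each physical node is placed on the ring several times.
        # The "<node>-<i>" label format is an assumption of this sketch.
        self._points = sorted(
            (md5_int(f"{node}-{i}"), node)
            for node in nodes
            for i in range(virtual_nodes)
        )
        self._hashes = [h for h, _ in self._points]

    def find_nodes(self, key, n=3):
        """Walk clockwise from hash(key), collecting n distinct physical nodes."""
        distinct = {node for _, node in self._points}
        n = min(n, len(distinct))
        if n <= 0:
            return []
        start = bisect.bisect_right(self._hashes, md5_int(key))
        found = []
        for offset in range(len(self._points)):
            _, node = self._points[(start + offset) % len(self._points)]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found


if __name__ == "__main__":
    ring = HashRing([f"node{i}" for i in range(1, 7)], virtual_nodes=2)
    print(ring.find_nodes("cart/joe", n=3))
```

Because only the points of a removed node disappear from the ring, a key's replica list changes only when one of its own replicas goes away, which is the "remapped range" behaviour shown on the slide where node2 is down.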
  • Dynamo: Partitioning, cont’d. Merkle. [Figure: ring running 0, 1, 2, ... 2^128, divided into 8 buckets; bob and joe are in the same bucket]
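As a rough illustration of dividing the ring into buckets, the sketch below maps a key to one of a fixed number of equal-sized ring segments. The function name `bucket_of` and the choice of taking the most significant part of an MD5 hash are assumptions for this example; the slides do not say how Kai or Dynamo computes the bucket.

```python
import hashlib


def bucket_of(key: str, number_of_buckets: int = 8, hash_bits: int = 128) -> int:
    """Return the index of the equal-sized ring segment that contains hash(key)."""
    h = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    # Scale the hash from [0, 2^hash_bits) down to [0, number_of_buckets).
    return (h * number_of_buckets) >> hash_bits


if __name__ == "__main__":
    for key in ("cart/joe", "cart/bob"):
        print(key, bucket_of(key))
```

Keys that land in the same bucket can then be synchronized as one unit, which is what makes per-bucket checksums practical.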
  • Dynamo: Membership. Detecting node failure. [Figure: the message “node2 is down” is passed from node to node around hash(node2); membership changes are spread by gossip]
  • Dynamo: Membership, cont’d. Chord; > O(10K); O(log v); < O(log v). [Figure: routing in Chord; brown knows only yellows; query “Who has cart/ken?”; labels v, ken]
  • Dynamo: get/put Operations. Client; Coordinator; Consistent hashing; N; R; W; get. [Figure: Request and Response between Client and Coordinator for joe; get/put operations for N,R,W = 3,2,2]
  • Dynamo: get/put Operations, cont’d. Quorum: N, R, W; R+W > N; R < N; N,R,W = 3,2,2. [Figure: Request and Response between Client and Coordinator for joe; get/put operations for N,R,W = 3,2,2]
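The condition R+W > N matters because it forces every read set to share at least one replica with every write set. The brute-force check below is a sketch written for this transcript, not code from Kai; the function name is hypothetical.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every choice of r readers shares a replica with every choice of w writers."""
    replicas = range(n)
    return all(
        set(readers) & set(writers)
        for readers in combinations(replicas, r)
        for writers in combinations(replicas, w)
    )


if __name__ == "__main__":
    print(quorums_always_overlap(3, 2, 2))  # N,R,W = 3,2,2 from the slides
    print(quorums_always_overlap(3, 1, 1))  # R+W <= N, so a read can miss a write
```

With N,R,W = 3,2,2 any two replicas that answer a get include at least one that acknowledged the latest put, while the request still succeeds when one of the three replicas is down.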
  • Dynamo: Versioning. Vector Clocks; (A:2, C:1): A 2, B 1. [Figure: cause and effect of version (A:1, B:1, C:1) over time, with Cause, Effect and Independent regions. nodeA: (A:0), (A:1), (A:2, C:1), (A:3, B:1, C:1). nodeB: (A:1, B:1, C:1), (A:3, B:2, C:1). nodeC: (A:1, C:1), (A:2, C:2)]
  • Dynamo: Versioning, cont’d. Vector Clocks. [Figure: cause and effect of version (A:1, B:1, C:1) over time, with Cause, Effect and Independent regions. nodeA: (A:0), (A:1), (A:2, C:1), (A:3, B:1, C:1). nodeB: (A:1, B:1, C:1), (A:3, B:2, C:1). nodeC: (A:1, C:1), (A:2, C:2)]
  • Dynamo: Versioning, cont’d. Vector Clocks: (A:1, B:1, C:1) < (A:3, B:1, C:1); (A:1, B:1, C:1) ? (A:2, C:1). [Figure: cause and effect of version (A:1, B:1, C:1) over time, with Cause, Effect and Independent regions. nodeA: (A:0), (A:1), (A:2, C:1), (A:3, B:1, C:1). nodeB: (A:1, B:1, C:1), (A:3, B:2, C:1). nodeC: (A:1, C:1), (A:2, C:2)]
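The two comparisons on the slide above can be reproduced with a short sketch. It represents a vector clock as a Python dict from node name to counter, treating a missing node as 0; the function names `update` and `compare` are chosen for this example and are not the kai_version API.

```python
def update(clock: dict, node: str) -> dict:
    """Return a copy of clock with node's counter incremented by one."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def compare(a: dict, b: dict) -> str:
    """Order two vector clocks: 'before', 'after', 'equal' or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a is a cause of b
    if b_le_a:
        return "after"       # a is an effect of b
    return "concurrent"      # independent versions; neither descends from the other


if __name__ == "__main__":
    v = {"A": 1, "B": 1, "C": 1}
    print(compare(v, {"A": 3, "B": 1, "C": 1}))  # before
    print(compare(v, {"A": 2, "C": 1}))          # concurrent
```

Versions that compare as concurrent are the ones marked Independent in the figure; they cannot be ordered and have to be reconciled.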
  • Dynamo: Synchronization. Merkle (Merkle). [Figure: comparison of hierarchical checksums in Merkle trees over keys win, mac, linux, bsd. node1: root 3377; inner F1C9, 12D5; leaves 8FF3, 9F9D, F632, 34B7. node2: root 5C37; inner E334, 12D5; leaves 8FF3, 9F9D, F632, E1F3]
  • Dynamo: Synchronization, cont’d. Merkle. Steps: 1. Compare (roots), 2. Compare (inner checksums), 3. Compare (leaves). [Figure: comparison of hierarchical checksums in Merkle trees over keys win, mac, linux, bsd. node1: root 3377; inner F1C9, 12D5; leaves 8FF3, 9F9D, F632, 34B7. node2: root 5C37; inner E334, 12D5; leaves 8FF3, 9F9D, F632, E1F3]
  • Dynamo: Synchronization, cont’d. Merkle. Steps: 1. Compare (roots), 2. Compare (inner checksums), 3. Compare (leaves), 4. Sync. [Figure: comparison of hierarchical checksums in Merkle trees over keys win, mac, linux, bsd. node1: root 3377; inner F1C9, 12D5; leaves 8FF3, 9F9D, F632, 34B7. node2: root 5C37; inner E334, 12D5; leaves 8FF3, 9F9D, F632, E1F3]
  • Dynamo: Synchronization, cont’d. Steps: 1. Compare (roots), 2. Compare (inner checksums), 3. Compare (leaves), 4. Sync. [Figure: comparison of hierarchical checksums in Merkle trees over keys win, mac, linux, bsd. node1: root 3377; inner F1C9, 12D5; leaves 8FF3, 9F9D, F632, 34B7. node2: root 5C37; inner E334, 12D5; leaves 8FF3, 9F9D, F632, E1F3]
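The compare-then-sync walk on the Merkle slides can be sketched as follows. This is an illustration only, not kai_sync: it assumes MD5 checksums, a binary tree, and a leaf count that is a power of two and identical on both nodes. The function names are hypothetical.

```python
import hashlib


def digest(text: str) -> str:
    """Checksum of a string (MD5 is an assumption of this sketch)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_tree(values):
    """Return the tree as a list of levels, leaves first and root last.

    len(values) must be a non-zero power of two.
    """
    levels = [[digest(v) for v in values]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append(
            [digest(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)]
        )
    return levels


def differing_leaves(a, b, level=None, index=0):
    """Descend from the root, following only subtrees whose checksums differ."""
    if level is None:
        level = len(a) - 1  # start at the root
    if a[level][index] == b[level][index]:
        return []           # identical subtree: nothing to sync below here
    if level == 0:
        return [index]      # a leaf that must be synchronized
    return (differing_leaves(a, b, level - 1, 2 * index)
            + differing_leaves(a, b, level - 1, 2 * index + 1))


if __name__ == "__main__":
    node1 = build_tree(["win-v1", "mac-v1", "linux-v1", "bsd-v1"])
    node2 = build_tree(["win-v1", "mac-v1", "linux-v1", "bsd-v2"])
    print(differing_leaves(node1, node2))  # indexes of the leaves to sync
```

When the two roots match, the comparison stops after a single checksum, which is why the hierarchy keeps synchronization traffic low for replicas that are mostly in agreement.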
  • Dynamo: Implementation. Java; APIs: HTTP; BDB, MySQL.
  • Kai: Overview. Kai; Dynamo; OpenDynamo; Amazon Dynamo; Erlang; memcache API; http://sourceforge.net/projects/kai/
  • Kai: Building Kai. Erlang OTP (>= R12B); make. Build:
    % svn co http://kai.svn.sourceforge.net/svnroot/kai/trunk kai
    % cd kai/
    % make
    % make test
    make test: Makefile, RUN_TEST; MacOSX; ./configure
  • Kai: Configuration (kai.config)
    Parameter                  Description     Default value
    logfile
    hostname
    port                       API             11011
    memcache_port              memcache API    11211
    n, r, w                    N, R, W         3, 2, 2
    number_of_buckets                          1024
    number_of_virtual_nodes                    128
  • Kai: Running Kai
    % erl -pa src -config kai -kai n 1 -kai r 1 -kai w 1
    1> application:load(kai).
    2> application:start(kai).
    Arguments                     Description
    -pa src                       adds src to the Erlang code path
    -config kai                   uses kai.config
    -kai n 1 -kai r 1 -kai w 1    N, R, W = 1, 1, 1
    memcache: 127.0.0.1:11211
  • Kai: Running Kai, cont’d
    Terminal 1
    % erl -pa src -config kai -kai port 11011 -kai memcache_port 11211
    1> application:load(kai).
    2> application:start(kai).
    Terminal 2
    % erl -pa src -config kai -kai port 11012 -kai memcache_port 11212
    1> application:load(kai).
    2> application:start(kai).
    Terminal 3
    % erl -pa src -config kai -kai port 11013 -kai memcache_port 11213
    1> application:load(kai).
    2> application:start(kai).
  • Kai: Running Kai, cont’d
    3> kai_api:check_node({{127,0,0,1}, 11011}, {{127,0,0,1}, 11012}).
    4> kai_api:check_node({{127,0,0,1}, 11012}, {{127,0,0,1}, 11013}).
    5> kai_api:check_node({{127,0,0,1}, 11013}, {{127,0,0,1}, 11011}).
    memcache: 127.0.0.1:11211-11213
    Using kai_api:check_node/2:
    1> kai_api:check_node(NewNode, NodeInCluster).
    % …
    2> kai_api:check_node(NodeInCluster, NewNode).
  • Kai: Internals
    Function          Module          Comments
    Partitioning      kai_hash
    Membership        kai_network     Chord; kai_membership
    Coordinator       kai_memcache    kai_coordinator
    Versioning
    Synchronization   kai_sync        Merkle tree
    Storage           kai_store       ets
    Internal API      kai_api
    API               kai_memcache    get, set, delete
    Logging           kai_log
    Configuration     kai_config
    Supervisor        kai_sup
  • Kai: kai_hash. gen_server; Consistent hashing; 32bit; gen_server:call. Synopsis:
    kai_hash:start_link(),
    {replaced_buckets, ListOfReplacedBuckets} = kai_hash:update_nodes(ListOfNodesToAdd, ListOfNodesToRemove),
    % Key
    {nodes, ListOfNodes} = kai_hash:find_nodes(Key).
  • Kai: kai_network, will be renamed to kai_membership. Chord, Kademlia; gen_fsm; Kademlia: BitTorrent; EPMD. Synopsis:
    kai_network:start_link(),
    % Node
    kai_network:check_node(Node),
  • Kai: kai_coordinator, now implemented in kai_memcache. kai_memcache; gen_server. Synopsis:
    kai_coordinator:start_link(),
    % N get kai_memcache
    Data = kai_coordinator:get(Key),
    % N put Data data
    kai_coordinator:put(Data).
  • Kai: kai_version. Vector Clocks; gen_server. Synopsis:
    kai_version:start_link(),
    % VectorClocks LocalNode
    kai_version:update(VectorClocks, LocalNode),
    {order, Order} = kai_version:order(VectorClocks1, VectorClocks2).
  • Kai: kai_sync. gen_fsm; Merkle tree. Synopsis:
    kai_sync:start_link(),
    kai_sync:update_bucket(Bucket),
  • Kai: kai_store. gen_server; dets, mnesia, MySQL; Erlang; dets, mnesia; ets; 4GB. Synopsis:
    kai_store:start_link(),
    % Retrieves Data associated with Key
    Data = kai_store:get(Key),
    % Stores Data, which is a variable of the data record
    kai_store:put(Data).
  • Kai: kai_api. gen_tcp; API; kai_hash, kai_store, kai_network; TCP; RPC. Synopsis:
    kai_api:start_link(),
    % Node
    {node_list, ListOfNodes} = kai_api:node_list(Node),
    % Node Key
    Data = kai_api:get(Node, Key).
  • Kai: kai_memcache. gen_tcp; memcache API: get, set, delete; cas, stats; Kai; set exptime. Synopsis in Ruby:
    require 'memcache'
    cache = MemCache.new '127.0.0.1:11211'
    # 'key' 'value'
    cache['key'] = 'value'
    # 'key'
    p cache['key']
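For readers without a memcache client library at hand, the sketch below shows what the three supported commands look like on the wire in the standard memcached text protocol. The helper names are hypothetical, and using exptime 0 (no expiry) is an assumption of this sketch; the slide mentions exptime for set, but its explanation did not survive in this transcript.

```python
def set_command(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol 'set' request."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def get_command(key: str) -> bytes:
    """Build a memcached text-protocol 'get' request."""
    return f"get {key}\r\n".encode("ascii")


def delete_command(key: str) -> bytes:
    """Build a memcached text-protocol 'delete' request."""
    return f"delete {key}\r\n".encode("ascii")


if __name__ == "__main__":
    # These bytes could be written to a TCP socket connected to
    # 127.0.0.1:11211, the memcache endpoint used in the slides.
    print(set_command("key", b"value"))
    print(get_command("key"))
    print(delete_command("key"))
```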
  • Kai: Testing. make test; common_test; test_server?
  • Kai: Miscellaneous
    % Node ID, API
    {Addr, Port} = {{192,168,1,1}, 11011}.
    -record(data, {key, bucket, last_modified, checksum, flags, value}).
    % MD5
    -record(metadata, {key, bucket, last_modified, checksum}).
    % Key, 'undefined'
    undefined = kai_store:get(Key).
    % 'error', Reason
    {error, Reason} = function(Args).
  • Kai: Roadmap. 1. Dynamo; 2. Modules and tasks: kai_hash; kai_coordinator; kai_version (Vector clocks); kai_sync; kai_store; kai_api; kai_memcache (cas).
  • Kai: Roadmap, cont’d. Dynamo; 3. Modules and tasks: kai_hash; kai_membership (Chord, Kademlia); kai_sync (Merkle tree); kai_store; kai_api; kai_memcache (stats); configure, test_server.
  • Conclusion. http://sourceforge.net/projects/kai/