Dynomite at Erlang Factory

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    3 Favorites

    Dynomite at Erlang Factory - Presentation Transcript

    1. Dynomite Yet Another Distributed Key Value Store
    2. @moonpolysoft - questions
    3. A Crowded Field • Cassandra • Lightcloud • Memcachedb • Redis • Tokyo Tyrannical Cabinet Device Thing • Voldemort
    4. Who here has written one?
    5. Alpha - 0.6.0
    6. Focus on Distribution
    7. Focus on Performance
    8. • Latency • Average ~ 10 ms • Median ~ 5 ms • 99.9% ~ 1 s
    9. • Throughput • R12B ~ 2,000 req/s • R13B ~ 6,500 req/s
    10. At Powerset • 12 machine production cluster • ~6 million images + metadata • ~2 TB of data including replicas • ~139KB average size
    11. Production? • Data you can afford to lose • Compatibility will break • Migration will be provided
    12. The Constants
    13. Consistency Throughput Durability N – Replication
    14. Max Replicas per Partition
    15. Latency Consistency R – Read Quorum
    16. Minimum participation for read
    17. Latency Durability W – Write Quorum
    18. Minimum participation for write
    19. Throughput Scalability Q – Partitioning
    20. Partitions = 2Q
    21. QNRW – Defines your cluster
    22. At Powerset • Batch writes • Online reads • Wikipedia pages are obese
    23. Q – 6 N–3 R–1 W–2
    24. (dynomite@galva)2> mediator:put( \"prefs:user:merv\", undefined, term_to_binary( [{safe_search, false}, {results, 50}])).
    25. (dynomite@galva)2> mediator:put( \"prefs:user:merv\", undefined, term_to_binary( [{safe_search, false}, {results, 50}])).
    26. (dynomite@galva)2> mediator:put( \"prefs:user:merv\", undefined, term_to_binary( [{safe_search, false}, {results, 50}])).
    27. (dynomite@galva)2> mediator:put( \"prefs:user:merv\", undefined, term_to_binary( [{safe_search, false}, {results, 50}])).
    28. Value is always Binary
    29. (dynomite@galva)2> mediator:put( \"prefs:user:merv\", undefined, term_to_binary( [{safe_search, false}, {results, 50}])).
    30. (dynomite@galva)1> {ok, {Context, [Bin]}} = mediator:get(\"prefs:user:merv\"). {ok,{[{\"<0.630.0>\",1240543271.698089}], [<<131,108,0,0,0,2,...>>]}} (dynomite@galva)2> Terms = binary_to_term(Bin). [{safe_search,false},{results,50}]
    31. (dynomite@galva)1> {ok, {Context, [Bin]}} = mediator:get(\"prefs:user:merv\"). {ok,{[{\"<0.630.0>\",1240543271.698089}], [<<131,108,0,0,0,2,...>>]}} (dynomite@galva)2> Terms = binary_to_term(Bin). [{safe_search,false},{results,50}]
    32. mediator:delete/1 mediator:has_key/1
    33. Up In Dem Gutz
    34. Client Protocols • Native Erlang Messages • Thrift • Protocol Buffers • Ascii Protocol
    35. Mediator
    36. N – 3 ;R – 2
    37. pmap(Fun, List, ReturnNum) -> N = if ReturnNum > length(List) -> length(List); true -> ReturnNum end, SuperParent = self(), SuperRef = erlang:make_ref(), Ref = erlang:make_ref(), %% we spawn an intermediary to collect the results %% this is so that there will be no leaked messages sitting in our mailbox Parent = spawn(fun() -> L = gather(N, length(List), Ref, []), SuperParent ! {SuperRef, pmap_sort(List, L)} end), Pids = [spawn(fun() -> Parent ! {Ref, {Elem, (catch Fun(Elem))}} end) || Elem <- List], Ret = receive {SuperRef, Ret} -> Ret end, % i think we need to cleanup here. lists:foreach(fun(P) -> exit(P, die) end, Pids),
    38. Membership
    39. Handles Nodes Joining
    40. membership:nodes() =/= nodes().
    41. • Receive join notification • Recalculate partition to node mapping • Start bootstrap routines • Stops old local storage servers • Starts new local storage servers • Install the new partition table
    42. Maps Partitions To Nodes
    43. Gossip Girl
    44. • Gossip servers randomly wake up • Pick another membership server • Merge membership tables • Versioned by vector clocks
    45. gossip_loop(Server) -> #membership{nodes=Nodes,node=Node} = gen_server:call(Server, state), case lists:delete(Node, Nodes) of [] -> ok; % no other nodes Nodes1 when is_list(Nodes1) -> fire_gossip(random_node(Nodes1)) end, SleepTime = random:uniform(5000) + 5000, receive stop -> gossip_paused(Server); _Val -> ok after SleepTime -> ok end, gossip_loop(Server).
    46. Storage Server
    47. Merkle Trees
    48. Merkle Trees
    49. O(log n)
    50. • Implemented on disk as a B-Tree • Includes buddy-block allocator for key space • Extremely gnarly code
    51. Minimal Transfer
    52. Sync Server
    53. Randomly Wakes Up
    54. Choses two replicas
    55. Transfers Merkle Diff
    56. [cli@yourmom ~]$ dynomite console (dynomite@yourmom)1> sync_manager:running(). [{3623878657,'dynomite@yourmom','dynomite@yourd ad'}, {3892314113,'dynomite@yourmom','dynomite@cableg uy'},
    57. Problem Areas
    58. Membership
    59. Problems • New partitions online before they are ready • A priori knowledge of what is running • Possible to dilute a cluster into uselessness • Migrations are tremendously painful
    60. Direction • Storage servers rally with local membership server • Use monitors on local storage • Alerts for dangerous situations
    61. Merkle
    62. Erlang is good at many things
    63. Low level storage ain’t one
    64. Direction • Rewrite Dmerkle storage in C • Use async driver pool • Build in more crash recovery to the format
    65. Web Console
    66. Direction • More detailed visualizations of cluster health • Perform “safe” ops tasks
    67. And other things I haven’t thought of
    68. Demo!
    69. Spare a patch, friend?
    70. • http://github.com/cliffmoon/dynomite • http://wiki.github.com/cliffmoon/dynomite
    71. Q and or A

    + moonpolysoftmoonpolysoft, 6 months ago

    custom

    1169 views, 3 favs, 0 embeds more stats

    This is my presentation from the 2009 Erlang Factor more

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 1169
      • 1169 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 3
    • Downloads 17
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories