Building Distributed Systems With Riak Core
Andy Gross (@argv0)
VP Engineering, Basho
DevNation SF 2010
Riak K/V
• Distributed Key-Value Store
• Based on Amazon’s Dynamo
• HTTP and Binary (Protocol Buffers) APIs
• Data access by {Bucket, Key} (example below)
• JavaScript Map/Reduce
• Link Walking
• Pluggable Storage (Bitcask, InnoDB, ...)
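As a concrete picture of {Bucket, Key} access, here is a hedged sketch using Riak's local Erlang client of this era (parameterized-module style; names and arities may differ across releases):

    %% Hedged sketch: Riak's local Erlang client, circa this release.
    %% riak:local_client/0 and riak_object existed at the time; exact
    %% signatures may vary between versions.
    {ok, C} = riak:local_client(),
    Obj = riak_object:new(<<"artists">>, <<"REM">>, <<"data">>),
    ok = C:put(Obj, 2),                                  %% W = 2
    {ok, Fetched} = C:get(<<"artists">>, <<"REM">>, 2).  %% R = 2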
High-Level Dynamo
• Decentralized (no “master” nodes)
• Homogeneous (all nodes can do anything)
• Vector clocks (no reliance on physical time)
• Gossip Protocol (no global state)
• Consistent Hashing for replica placement
  (a local calculation for each node)
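The placement calculation really is local. A minimal, self-contained sketch of the idea (hypothetical module name; riak_core's actual ring logic differs in detail but uses the same 160-bit SHA-1 keyspace divided into fixed partitions):

    %% Minimal consistent-hashing sketch (hypothetical module;
    %% riak_core's real implementation differs in detail).
    -module(chash_sketch).
    -export([partition_for/2]).

    %% Map {Bucket, Key} to a partition index on a 2^160 keyspace
    %% divided into NumPartitions equal partitions. Purely local:
    %% any node can compute this without coordination.
    partition_for({Bucket, Key}, NumPartitions) ->
        <<HashInt:160/integer>> = crypto:hash(sha, [Bucket, Key]),
        PartitionSize = (1 bsl 160) div NumPartitions,
        HashInt div PartitionSize.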
N, R, W Values
• N = number of replicas to store (on
  distinct nodes)
• R = number of replica responses needed
  for a successful read (specified per-request)
• W = number of replica responses needed
  for a successful write (specified per-request)
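To make R concrete: the coordinator asks all N replicas, then succeeds as soon as R replies arrive. A hedged sketch of that wait loop (hypothetical module; Riak's real coordination lives in dedicated get/put FSM processes):

    %% Quorum-wait sketch (hypothetical; Riak uses get/put FSMs).
    -module(quorum_sketch).
    -export([await/2]).

    %% Collect vnode replies until R have arrived, or time out.
    await(R, Replies) when length(Replies) >= R ->
        {ok, Replies};
    await(R, Replies) ->
        receive
            {r, _Partition, Reply} ->
                await(R, [Reply | Replies])
        after 5000 ->
            {error, timeout}
        end.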
Harvesting A Framework
• We noticed that Riak code fell into one of
  two categories
  • Code specific to K/V storage
  • “generic” distributed systems code
• So we split Riak into K/V and Core
Distributed Coordination
• Making many machines act like one
• Division of labor
• Load balancing
• State storage
• Mutual exclusion/locking
Riak Core Applications

    +----------+----------+
    | Your App | Riak K/V |
    +----------+----------+
    |      Riak Core      |
    +---------------------+
Riak Core Applications

    +---------------------+
    |      Your App       |
    +----------+----------+
    | Your App | Riak K/V |
    +----------+----------+
    |      Riak Core      |
    +---------------------+
Riak Core Abstractions

• Virtual Nodes
• Preference Lists
• Ring Event Watchers
• Node Event Watchers
Virtual Nodes

• Primary actor in a Dynamo-based system
• Handles 1/num_partitions of the load
• Implements commands dispatched from
  clients
• Handles handoff when nodes join/leave
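A hedged skeleton of a vnode module (callback names follow the riak_core_vnode behaviour; a real module also implements the handoff callbacks omitted here):

    %% Vnode skeleton (hedged: minimal subset of the riak_core_vnode
    %% behaviour; real modules also implement handoff callbacks).
    -module(myapp_vnode).
    -behaviour(riak_core_vnode).
    -export([init/1, handle_command/3]).

    -record(state, {partition, store}).

    %% One vnode per ring partition; it owns that partition's data.
    init([Partition]) ->
        {ok, #state{partition = Partition, store = dict:new()}}.

    %% Commands dispatched from coordinating client processes.
    handle_command({put, K, V}, _Sender, S = #state{store = D}) ->
        {reply, ok, S#state{store = dict:store(K, V, D)}};
    handle_command({get, K}, _Sender, S = #state{store = D}) ->
        {reply, dict:find(K, D), S}.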
Preference Lists
• Lists of virtual nodes obtained by hashing a
  request (document ID, session ID, etc.)
• Allows any node to compute document
  locations
• Central to replication in Riak
• Down nodes are filtered out, replaced with
  next-best nodes in the ring.
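In code, computing a preference list looks roughly like this (hedged: these riak_core functions are real, but arities and return shapes may have shifted between releases; myapp is a hypothetical service name):

    %% Any node can compute a key's preference list locally.
    DocIdx = riak_core_util:chash_key({<<"session">>, <<"user42">>}),
    %% N = 3 vnodes for this key, filtered to nodes currently
    %% advertising the myapp service; down nodes are replaced by
    %% the next-best vnodes around the ring.
    PrefList = riak_core_apl:get_apl(DocIdx, 3, myapp).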
Ring Event Watchers

• Notified when ring state changes due to
  node addition/removal
• API: ring_update(NewRing)
• Can modify ring state in an app-specific
  fashion
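A hedged sketch of wiring up a ring event watcher (registration via riak_core_ring_events:add_sup_callback/1, per the riak_core source; the myapp names are hypothetical):

    %% Ring event watcher sketch (hedged).
    -module(myapp_ring_watcher).
    -export([start/0, ring_update/1]).

    start() ->
        riak_core_ring_events:add_sup_callback(fun ?MODULE:ring_update/1).

    %% Invoked whenever gossip installs a changed ring, e.g. after
    %% a node joins or leaves the cluster.
    ring_update(NewRing) ->
        Owners = riak_core_ring:all_owners(NewRing),
        error_logger:info_msg("ring changed; ~p partitions~n",
                              [length(Owners)]).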
Node Event Watchers

• Nodes run and advertise “services”
• API: service_update(Services)
• Active service list used to generate per-app
  preference lists.
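Advertising and querying services is a couple of calls (hedged: riak_core_node_watcher API as published; myapp is a hypothetical service name):

    %% Advertise a service once its local process is up; node
    %% liveness is then tracked per service.
    riak_core_node_watcher:service_up(myapp, self()),
    %% Which nodes currently advertise myapp?
    Nodes = riak_core_node_watcher:nodes(myapp).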
Use cases
• If building distributed systems isn't your
  core business, outsource it!
• Providing a distribution layer on top of
  non-distributed systems like:
  • Couch, Redis, Memcached
• Implementing your own systems.
Current Status and Roadmap
• Erlang-only now, but not for long (HTTP
  and PB APIs coming)
• Some harvesting left to do (versioned
  objects, ring/node handler utilities)
• Project templates - skeleton code for
  writing Riak Core-based systems.
• Stronger consistency models (with a
  Paxos/ZAB-like protocol)
Thanks!

• http://wiki.basho.com
• http://github.com/basho
• http://twitter.com/basho/team
• irc://freenode.net/#riak
• Riak SF Meetup (on meetup.com)
• Visit us! 795 Folsom @ 4th (Twitter Bldg.)
