Embrace NoSQL and Eventual Consistency with Ripple
Sean Cribbs
Basho Technologies
@seancribbs


Understanding
Eventual Consistency
[Diagram, built up over several slides: the Internet, a CDN, and a database master (M) replicating to two slaves (S)]
Is this system
 consistent?
You sacrificed ACID
  as soon as you
 added caching.
How does your
application cope?
Inconsistency Creep

       Cache staleness
  Master-slave replication lag
     CDN propagation lag
       Race-conditions
        Service failures
Eventual Consistency

Tolerates temporary inconsistency
    Moves toward consistency
          automatically
  Resilient, maintains availability
  “Eventual” in seconds, not days
Real-World
Eventual Consistency

        DNS
        LDAP
       The Web
Harvest & Yield

CAP “For Dummies”
Harvest: how much of the dataset is reflected in a response (C)
Yield: how likely the datastore is to complete a request (A)
http://codahale.com/you-cant-sacrifice-partition-tolerance/
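A minimal sketch of the two ratios, to make the definitions concrete. The method names and all of the numbers are hypothetical, chosen only for illustration; they do not come from Riak or from the linked article.

```ruby
# Harvest: fraction of the dataset reflected in a response.
# Here the dataset is assumed to be split into equal partitions.
def harvest(partitions_answering, partitions_total)
  partitions_answering.to_f / partitions_total
end

# Yield: fraction of requests the datastore completed.
def yield_fraction(requests_completed, requests_attempted)
  requests_completed.to_f / requests_attempted
end

# Hypothetical scenario: one of four partitions is unreachable, but
# queries are still answered from the remaining three.
h = harvest(3, 4)
y = yield_fraction(998, 1000)
puts "harvest=#{h} yield=#{y}"
```

Answering from three partitions keeps yield high at the cost of harvest; refusing to answer until all four are back would do the opposite.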
# Do I need it?
def eventual_consistency?
  @money === @uptime
end

# “Any sufficiently large
# system is in a constant
# state of failure.”
#   --Justin Sheehy, Basho CTO
Ok, I’ll use Eventual
Consistency. How?
Oh, those NoSQL
     things.
NoSQL Flavors (from structured to unstructured)
Graph (Neo4j, InfiniteGraph)
Document (MongoDB, CouchDB)
Column (HBase, Cassandra)
Key-Value (Riak, Voldemort)
Not all NoSQLs are
Eventually Consistent.
Riak implements
Eventual Consistency.
What is Riak?
Riak is...

      a scalable
   highly-available
networked/distributed
   key-value store
Riak’s Data Model

  Stores values against keys
     Values are opaque
Keys are grouped into buckets
Values Have Metadata
        Content-Type
       Last-Modified
            Link
    Secondary index values
       Custom metadata
 Value + Metadata == “Object”
Basic Operations

 GET /riak/bucket/key
 PUT /riak/bucket/key
DELETE /riak/bucket/key
 HTTP or Protocol Buffers
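The three operations above can be sketched with Ruby's standard Net::HTTP, without any Riak gem. This assumes a Riak node on localhost port 8098; the helper names (`object_uri`, `put_request`, `send_to_riak` and so on) are hypothetical and exist only for this example.

```ruby
require 'net/http'
require 'uri'

# Assumption: a local Riak node answering HTTP on localhost:8098.
def object_uri(bucket, key, base = 'http://localhost:8098/riak')
  URI("#{base}/#{bucket}/#{key}")
end

# GET /riak/bucket/key
def get_request(bucket, key)
  Net::HTTP::Get.new(object_uri(bucket, key))
end

# PUT /riak/bucket/key with a body and its Content-Type
def put_request(bucket, key, body, content_type)
  req = Net::HTTP::Put.new(object_uri(bucket, key))
  req['Content-Type'] = content_type
  req.body = body
  req
end

# DELETE /riak/bucket/key
def delete_request(bucket, key)
  Net::HTTP::Delete.new(object_uri(bucket, key))
end

# Sends a prepared request. Only call this with a node actually running.
def send_to_riak(req)
  Net::HTTP.start(req.uri.host, req.uri.port) { |http| http.request(req) }
end

req = put_request('venues', 'RubyConfAR', '{"city":"Buenos Aires"}', 'application/json')
puts "#{req.method} #{req.path}"
```

Building the request separately from sending it keeps the sketch inspectable even when no node is available.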
Other Query Methods

 Link-traversal (“walking”)
     Full-text Search
    Secondary Indexes
        MapReduce
Ripple connects
  Ruby to Riak.
# gem install riak-client
require 'riak'

# Connect to Riak on localhost
client = Riak::Client.new

# Fetch something, write it to a file
o = client['slide-decks']['RubyConfAR']

File.open('slides.key', 'wb') do |f|
  f.write o.raw_data
end
# Create a new key
v = client['venues'].new('RubyConfAR')

# Set the value, JSON by default
v.data = {
  name: 'Ciudad Cultural Konex',
  address: {
    street: 'Sarmiento 3131',
    city: 'Buenos Aires',
    zip: 'C1196AAG'
  }
}

# Write it to Riak
v.store
# Create a new key for an image
pic = client['pictures'].new('bue1.jpg')

# Read the file as binary, set the content-type
pic.raw_data = File.binread('bue1.jpg')
pic.content_type = 'image/jpeg'

# Set some metadata
pic.meta['camera'] = 'Nikon'
pic.indexes['date_int'] << Time.now.to_i

# Write it to Riak
pic.store
But I want models!
 Ok, no problem.
require 'ripple' # gem install ripple

class Person
  include Ripple::Document
  property :name, String, presence: true
  key_on :name
  many :addresses
end

class Address
  include Ripple::EmbeddedDocument

  property :street, String
  property :city, String
  property :zip, String
end
# Create my info
me = Person.new(
  name: 'Sean Cribbs',
  addresses: [{
    street: '929 Market St',
    city: 'San Francisco',
    zip: '94103'
  }]
)

me.save! # Store as JSON under
         # "people/Sean Cribbs"

me == Person.find('Sean Cribbs') # true
That’s kind of cool.
    But what about
Eventual Consistency?
Riak uses Causal
Eventual Consistency,
   tracked with Vector Clocks
Causal Consistency
   Each read includes a token of the
      current state (vector clock)
Clients send the token back with writes
Riak updates the token and detects stale
 versions and conflicts by comparison
     Conflicts are exposed to your
              application
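A minimal sketch of how comparing vector clocks separates stale versions from true conflicts. This is not Riak's implementation (clients normally treat Riak's vector clock as an opaque token); the Hash representation and the names `increment`, `descends?` and `compare` are assumptions made for illustration.

```ruby
# A vector clock modeled as a Hash of actor => counter.
def increment(clock, actor)
  clock.merge(actor => clock.fetch(actor, 0) + 1)
end

# True when clock a has seen every event that clock b has seen.
def descends?(a, b)
  b.all? { |actor, count| a.fetch(actor, 0) >= count }
end

# Classifies clock a relative to clock b.
def compare(a, b)
  a_over_b = descends?(a, b)
  b_over_a = descends?(b, a)
  if a_over_b && b_over_a then :equal
  elsif a_over_b then :newer
  elsif b_over_a then :stale
  else :conflict
  end
end

# Two clients read the same version, then both write.
base    = increment({}, 'client_a')
write_a = increment(base, 'client_a')
write_b = increment(base, 'client_b')

puts compare(write_a, base)    # write_a descends from base
puts compare(base, write_a)    # base is an ancestor of write_a
puts compare(write_a, write_b) # neither descends from the other
```

When neither clock descends from the other, no ordering exists between the two writes, which is the case that gets handed to the application.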
Conflicts? That
 sounds bad.
It happens frequently without strict
   consistency. For example, have
two threads write at the same time.
   Riak detects this conflict for you.
Your application is also
  part of the Eventually
  Consistent system. It
makes the hard decisions
    that Riak can’t.
Ok. So how do I
resolve conflicts?
Riak will give you all values
   that can’t be resolved
  automatically. Just write
 back the resolved value.
Think of it like merging or
  rebasing git branches.
 Sometimes you MUST use
git mergetool to fix things.
Two Ways to Resolve


 Semantic Resolution
       CRDTs
Semantic Resolution


 Given any number of values,
compute the resolved version
    using business logic.
Example

 If your value is a shopping cart,
merge all items and sum quantities
         into a single cart.
      Maybe the customer
      buys more that way!
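The cart merge can be tried on plain hashes, with no Riak involved. This is a sketch under the assumption that a cart is a Hash of product id to quantity; `merge_carts` and the sample products are hypothetical.

```ruby
# Merges any number of conflicting carts into one, summing quantities.
def merge_carts(siblings)
  siblings.each_with_object(Hash.new(0)) do |cart, merged|
    cart.each { |product_id, quantity| merged[product_id] += quantity }
  end
end

# Two conflicting versions of the same cart (hypothetical data).
sibling_a = { 'mate' => 1, 'alfajor' => 6 }
sibling_b = { 'mate' => 1, 'yerba' => 2 }

merged = merge_carts([sibling_a, sibling_b])
puts merged.inspect
```

Summing counts an item twice when it appears in both siblings, so the merged cart can only grow. That is a business decision, which is why the resolution logic belongs in the application rather than in the datastore.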
Ripple supports
Semantic Resolution.
# A cart is { product_id => quantity }
Riak::RObject.on_conflict do |obj|
  # Only resolve carts; other buckets fall through and return nil
  if obj.bucket.name == 'carts'
    obj.content_type = 'application/json'
    obj.data = {}
    obj.siblings.each do |val|
      val.data.each do |product, q|
        obj.data[product] ||= 0
        obj.data[product] += q
      end
    end
    obj
  end
end
# Documents can be resolved too
Person.on_conflict(:addresses) do |vals|
  vals.each do |doc|
    self.addresses &= doc.addresses
  end
end
Commutative
Replicated Data-Types
        Recent area of research
          Data-type defines
       commutative operations
         Value contains a state
   and limited history of operations
     Roll-back, merge and replay
              to resolve
http://hal.inria.fr/inria-00397981/en/
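A rough sketch of the roll-back, merge and replay idea, using an add-only set because adding to a set is commutative and idempotent. It is not statebox's API and not the construction from the paper; the Hash layout and the names `add`, `value_of` and `merge` are assumptions, and siblings are assumed to share their retained base state.

```ruby
require 'set'

# A box is { base: Set, ops: [[timestamp, item], ...] }.
# :base is the state before the retained history; :ops is the history.
def add(box, timestamp, item)
  { base: box[:base], ops: box[:ops] + [[timestamp, item]] }
end

# Replays the history on top of the base state.
def value_of(box)
  box[:ops].sort.reduce(box[:base]) { |set, (_ts, item)| set | [item] }
end

# Rolls every sibling back to its base, merges the histories,
# and leaves the replay to value_of.
def merge(siblings)
  {
    base: siblings.map { |b| b[:base] }.reduce(:|),
    ops: siblings.flat_map { |b| b[:ops] }.uniq.sort
  }
end

# Two clients update the same box concurrently (hypothetical data).
start = { base: Set.new(['alice']), ops: [] }
box_a = add(start, 1, 'bob')
box_b = add(start, 2, 'carol')

merged = merge([box_a, box_b])
puts value_of(merged).to_a.sort.inspect
```

Because the operations commute, merging in either order, or merging the same sibling twice, gives the same value.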
Currently, only Erlang
has a CRDT for Riak,
 called statebox.
We’re researching how
to bring CRDTs to Ruby
 and other languages.
Conclusion
    Any sufficiently large system is
 inconsistent and constantly failing
Riak remains available during failures,
   progresses toward consistency
  Riak & Ripple help your application
     recover from inconsistency
Let's embrace Riak