SlideShare a Scribd company logo
1 of 28
Download to read offline
Conflict resolution in Kai


    Shun'ichi Shinohara (shino)

  @ Erlang/distributed workshop #2
  2008-11-20 (Thu), KLab, Tokyo



             Copyright by shino
    http://creativecommons.org/licenses/by-sa/2.1/jp/deed.en_US
Goals
☞ Learn that events are NOT total ordered,
  but can be “concurrent” in a distributed
  system.

☞ Understand some points on conflict
  resolution in Amazon's Dynamo.

☞ Discuss on Kai's designs and implementations
  concerning conflict resolution.

                                                 2
Contents (1/2)
I.Lamport's logical clock
  ●
      “happens-before” and “concurrent”.
  ●
      Construction and features.
  ●
      Gleaning.

II.Vector Clocks
  ●
      Construction.
  ●   keeping slender (practice).


                                           3
Contents (2/2)
III.Some points on Amazon's Dynamo
  ●
      Coordinators increment vector clocks.
  ●
      Client-side conflict resolution.
  ●
      Multiple versions in a single node.

IV.Discussions on Kai's conflict resolution
  ●
      Packing checksums to “cas_unique” (lossy).
  ●   How/Whether to merge a set of vector clocks.
  ●   Consistency in putting data.

                                                     4
I. Lamport's logical clock
●   Idea
    ●
        Two events are NOT ALWAYS ordered in a
        distributed system.
    ●
        Orders for causally-related events are only
        meaningful.

    ●
        Inspired by A. Einstein's special relativity.




                                                        5
Causality in Special Relativity

                              ●   Three types of
                                  regions
                                  ●   Past,
                                  ●   Future,
                                  ●
                                      Simultaneous.
                              ●   Simultaneous is
                                  ●   not a plane,
                                  ●   but a region.

        http://en.wikipedia.org/wiki/Light_cone
                                                      6
Problem setting
●   Processes communicate by passing messages.
●   Messages <==> causal relationships.




         http://commons.wikimedia.org/wiki/Image:Lamport-Clock-en.svg
                                                                   7
“happens-before” (->)
●
    In a single process, -> means the “normal”
    order, e.g. A1 -> A2.
●   E1/E2 are sending/receipt of the same message,
    then “E1 -> E2”, e.g. A2 -> B4.
●   “E1 -> E2 and E2 -> E3” then “E1 -> E3”, e.g.
    A1 → B4.




                                                 8
“concurrent” (||)
●   “not (A -> B)” and “not (B -> A)”.
●   e.g. A2 and B3, B4 and C2.




                                         9
Construction
●   Increment when any event occurs, and
●   Message receivers increment their clock to
    ●
        max(present clock, sender's clock) + 1.




            http://commons.wikimedia.org/wiki/Image:Lamport-Clock-en.svg
                                                                      10
Features
●   When A -> B, C(A) < C(B).           ---(*)
    ●
        Clock order is consistent with causality.
●   But: C(A) < C(B) does not imply A -> B.
    ●
        Clock order does NOT determine a unique
        causality-relation.
●   Just: C(A) <= C(B) implies “not (B -> A)”
    ●   The contraposition of (*).
    ●
        “not (B -> A)” is equivalent “A -> B or A || B”.


                                                       11
Gleaning
●   Extension to total ordering.
    ●
        Label an event as (process id, logical clock).
    ●
        An (arbitrary) total order in {process id}.
    ●
        (P(A), C(A)) < (P(B), C(B)) if and only if
        ●   C(A) < C(B), or
        ●   C(A) = C(B) and P(A) < P(B).
●   Example: mutex algorithm
    ●
        Non centralized, but all nodes should participate.
    ●
        Consult the paper for details :-)

                                                         12
II. Vector Clocks
●   Vector clocks provides the clock s.t.
    ●
        VC(A) < VC(B) then A -> B,
    ●
        VC(A) || VC(B) then A || B,
    ●
        Where VC(A) is the vector clock of an event A.

●   It determines causality relation, not only
    “happens-before”, but also “concurrent” for
    events.
●   In practice: keeping them slender.

                                                         13
Construction
●   A vector clock: a set of {process id, clock}.
●   Increment own clock and merge (as follows).




          http://commons.wikimedia.org/wiki/Image:Vector_Clock.svg 14
Keeping slender (practice)
●   Element size can be bloated.
●   As a size = #(related nodes) grows,
    ●
        Comparison slows down: not so bad (usually),
    ●
        Data consumes space (memory/storage): bad,
    ●
        Messages become fat: TOO BAD!

●   Related nodes should be as few as possible.



                                                       15
III. Some points from Amazon's
                 Dynamo
☞ Coordinators increment vector clocks.
☞ Client-side conflict resolution.
☞ Multiple versions in a single node.


●   This part includes my guess.




                                          16
Coordinators increment vector
                    clocks
●   In section 4.3 in the Dynamo paper.
    ●
        4.3. Replication
         ●   “Each key, k, is assigned to a coordinator node
             [...]. The coordinator is in charge of the replication
             of the data items that fall within its range.”
●   Without failures, every vector clock has just
    one element.
●   Suppress vector clocks to be fat.



                                                                      17
Client-side conflict resolution
●   If concurrent versions exist in server-side,
    ●
        A Client receives multiple data, each with vector
        clocks,
    ●
        The client merge
         ●
             Not only data (e.g. for hash-type data, just “merge”
             them),
         ●
             But vector clocks.

●   Servers don't know data schema.
●   Both servers and clients know vector clock
    manipulation.
                                                               18
Multiple versions in a single node
●   In section 6 and 6.4
    ●
        6. Experiences & Lessons Learned
        ●   “All the measurements presented in this section were
            taken on a live system operating with a configuration
            of (3,2,2)”
    ●
        6.3 Divergent Versions: When and How Many?
        ●   the number of versions returned [..]
            99.94% of requests saw exactly one version; [..]
            and 0.00009% of requests saw 4 versions.
●   A single node stores multiple conflicting data
    possibly.

                                                                    19
IV. Discussions on Kai's
            implementation
☞ Packing checksums to “cas_unique”
☞ How/Whether to merge a set of vector
  clocks
☞ Consistency in putting data




                                         20
Packing checksums to “cas_unique”
●   Kai uses memcached protocol between clients
    and servers
●   Memcached's “gets” and “cas” commands uses
    “cas_unique” field (uint64) for optimistic
    lock
●   Mapping of Dynamo operations to memcached
    command:
    ●
        GET => gets,
    ●
        PUT => cas.


                                              21
Packing checksums to “cas_unique”
●   Packing one checksum (no concurrent data).
    ●
        <<1:4, [data's checksum]:60>>
●   Packing two checksums.
    ●
        <<2:4, [data1's checksum]:30,
                 [data2's checksum]:30>>
●   General cases are
    ●   First 4 bit: #(data),
    ●
        [Each data's checksum]:(60/#(data)),
    ●
        Zero padding rest.
●   Can ONLY check consistency (lossy).
                                                 22
How/Whether to merge a set of
            vector clocks
●   Clients don't merge vector clocks in Kai
●   Who merges them???
●   kai_coordinator can merge the clocks in cas,
    ●   At the cost of some message delay.
        ●   Receive “cas” command,
        ●   Collect data from replica nodes (just as “get”),
        ●   Check consistency, cas_unique <==> vector clocks,
        ●   Merge vector clocks,
        ●   Store data in each replica node.
    ●   Spoils the advantage of (N,R,W) flexibility.
                                                                23
Alternatives (1/2)
●   Break the memcached protocol in
    ●
      “The server will transmit back unstructured
      data in exactly the same way it received it”
      (from the memcached protocol).
●   Sketch of implementation:
    ●
        Merge related vector clocks,
    ●
        Serialize the merged vector clock,
    ●   Use cas_unique to represents its length
    ●   Concatenate the serialized vector clock and the
        user data and send in a data block.
                                                          24
Alternatives (2/2)
●   Server side resolution,
    ●
        e.g. a user can add-on a module of resolution
        logic.
●   Other protocols,
    ●
        e.g. HTTP, XMPP(i don't know much).
    ●
        Memcached's client libraries by C and various
        language wrappers are appealing.
●   Nice idea?


                                                        25
Consistency in putting data
●   Problem
    ●
        Operations go on multiple nodes.
    ●
        How to guarantee the consistency between
        updating vector clocks and storing data?

●   “Multiple data in a single node” can help?
    ●
        Appropriate to Dynamo's “always writable” policy.




                                                        26
Summary
☞ Causality in a distributed system has only
    partial order.

☞ Vector clocks is easy to implement/use, but
    should pay attention to keep them slender.

☞ “Multiple versions in single node” seems
    nice.

●   Please discuss anything later and at MLs.    27
References
●   Lamport's logical clock
    ●     http://research.microsoft.com/users/lamport/pubs/pubs.html#time-clocks
    ●     http://www.cs.tsukuba.ac.jp/~yas/sie/csys-2007/2008-02-29/
●   Vector clocks
    ●     http://www.vs.inf.ethz.ch/publ/papers/VirtTimeGlobStates.pdf
    ●     http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.7435
    ●
          http://www.slideshare.net/takemaru/kai-an-open-source-implementation-of-amazons-dynamo-472179
●   Dynamo
    ●     http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
●   Kai
    ●     http://kai.wiki.sourceforge.net/
●   Memcached protocol
    ●     http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt


                                                                                                          28

More Related Content

What's hot

Clockless design language - ilia greenblat
Clockless design language - ilia greenblatClockless design language - ilia greenblat
Clockless design language - ilia greenblatchiportal
 
Java MySQL Connector & Connection Pool Features & Optimization
Java MySQL Connector & Connection Pool Features & OptimizationJava MySQL Connector & Connection Pool Features & Optimization
Java MySQL Connector & Connection Pool Features & OptimizationKenny Gryp
 
Facebook C++网络库Wangle调研
Facebook C++网络库Wangle调研Facebook C++网络库Wangle调研
Facebook C++网络库Wangle调研vorfeed chen
 
Jvm Performance Tunning
Jvm Performance TunningJvm Performance Tunning
Jvm Performance Tunningguest1f2740
 
Grand Central Dispatch - iOS Conf SG 2015
Grand Central Dispatch - iOS Conf SG 2015Grand Central Dispatch - iOS Conf SG 2015
Grand Central Dispatch - iOS Conf SG 2015Ben Asher
 
Javaone 2016 : Supercharge your (reactive) Streams
Javaone 2016 : Supercharge your (reactive) StreamsJavaone 2016 : Supercharge your (reactive) Streams
Javaone 2016 : Supercharge your (reactive) StreamsJohn McClean
 
Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)
Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)
Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)DECK36
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentationEnkitec
 
Geneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java DevelopersGeneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java DevelopersMichaël Figuière
 
MySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comMySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comJean-François Gagné
 
Varnish in action phpday2011
Varnish in action phpday2011Varnish in action phpday2011
Varnish in action phpday2011Combell NV
 
Apache zookeeper 101
Apache zookeeper 101Apache zookeeper 101
Apache zookeeper 101Quach Tung
 
Varnish in action phpuk11
Varnish in action phpuk11Varnish in action phpuk11
Varnish in action phpuk11Combell NV
 
Performance tuning jvm
Performance tuning jvmPerformance tuning jvm
Performance tuning jvmPrem Kuppumani
 
Pune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPrashant Rane
 
MariaDB Server on macOS - FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS -  FOSDEM 2022 MariaDB DevroomMariaDB Server on macOS -  FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS - FOSDEM 2022 MariaDB DevroomValeriy Kravchuk
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialJean-François Gagné
 

What's hot (18)

Clockless design language - ilia greenblat
Clockless design language - ilia greenblatClockless design language - ilia greenblat
Clockless design language - ilia greenblat
 
Java MySQL Connector & Connection Pool Features & Optimization
Java MySQL Connector & Connection Pool Features & OptimizationJava MySQL Connector & Connection Pool Features & Optimization
Java MySQL Connector & Connection Pool Features & Optimization
 
Facebook C++网络库Wangle调研
Facebook C++网络库Wangle调研Facebook C++网络库Wangle调研
Facebook C++网络库Wangle调研
 
无锁编程
无锁编程无锁编程
无锁编程
 
Jvm Performance Tunning
Jvm Performance TunningJvm Performance Tunning
Jvm Performance Tunning
 
Grand Central Dispatch - iOS Conf SG 2015
Grand Central Dispatch - iOS Conf SG 2015Grand Central Dispatch - iOS Conf SG 2015
Grand Central Dispatch - iOS Conf SG 2015
 
Javaone 2016 : Supercharge your (reactive) Streams
Javaone 2016 : Supercharge your (reactive) StreamsJavaone 2016 : Supercharge your (reactive) Streams
Javaone 2016 : Supercharge your (reactive) Streams
 
Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)
Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)
Our Puppet Story – Patterns and Learnings (sage@guug, March 2014)
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentation
 
Geneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java DevelopersGeneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java Developers
 
MySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.comMySQL Parallel Replication by Booking.com
MySQL Parallel Replication by Booking.com
 
Varnish in action phpday2011
Varnish in action phpday2011Varnish in action phpday2011
Varnish in action phpday2011
 
Apache zookeeper 101
Apache zookeeper 101Apache zookeeper 101
Apache zookeeper 101
 
Varnish in action phpuk11
Varnish in action phpuk11Varnish in action phpuk11
Varnish in action phpuk11
 
Performance tuning jvm
Performance tuning jvmPerformance tuning jvm
Performance tuning jvm
 
Pune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCDPune-Cocoa: Blocks and GCD
Pune-Cocoa: Blocks and GCD
 
MariaDB Server on macOS - FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS -  FOSDEM 2022 MariaDB DevroomMariaDB Server on macOS -  FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS - FOSDEM 2022 MariaDB Devroom
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
 

Viewers also liked

Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14mattsmcnulty
 
Downtown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for SuccessDowntown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for SuccessVierbicher
 
Appraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approachAppraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approachMark S. Steed
 
The Economics of Green Building
The Economics of Green BuildingThe Economics of Green Building
The Economics of Green Buildingnilskok
 
The Etsy Shard Architecture: Starts With S and Ends With Hard
The Etsy Shard Architecture: Starts With S and Ends With HardThe Etsy Shard Architecture: Starts With S and Ends With Hard
The Etsy Shard Architecture: Starts With S and Ends With Hardjgoulah
 
Increment letter format
Increment letter formatIncrement letter format
Increment letter formatDeepti Joshi
 
Downtown & Infill Tax Increment Districts
Downtown & Infill Tax Increment DistrictsDowntown & Infill Tax Increment Districts
Downtown & Infill Tax Increment DistrictsVierbicher
 
Increment Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show modeIncrement Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show modeVipul Saxena
 
Lecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operatorsLecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operatorseShikshak
 
Scrum - Agile Methodology
Scrum - Agile MethodologyScrum - Agile Methodology
Scrum - Agile MethodologyNiel Deckx
 
Iocl compensation
Iocl compensationIocl compensation
Iocl compensationmukti91
 
Normal forest – growing stock and increment
Normal forest – growing stock and incrementNormal forest – growing stock and increment
Normal forest – growing stock and incrementiqbalforestry
 
An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...Angela Lozano
 
C Prog. - Operators and Expressions
C Prog. - Operators and ExpressionsC Prog. - Operators and Expressions
C Prog. - Operators and Expressionsvinay arora
 

Viewers also liked (20)

Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14Polymer & the web components revolution 6:25:14
Polymer & the web components revolution 6:25:14
 
Agile Development
Agile DevelopmentAgile Development
Agile Development
 
Downtown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for SuccessDowntown & Infill Tax Increment Districts: Strategies for Success
Downtown & Infill Tax Increment Districts: Strategies for Success
 
Appraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approachAppraisal and Performance Management in Schools - A practical approach
Appraisal and Performance Management in Schools - A practical approach
 
The Economics of Green Building
The Economics of Green BuildingThe Economics of Green Building
The Economics of Green Building
 
The Etsy Shard Architecture: Starts With S and Ends With Hard
The Etsy Shard Architecture: Starts With S and Ends With HardThe Etsy Shard Architecture: Starts With S and Ends With Hard
The Etsy Shard Architecture: Starts With S and Ends With Hard
 
Increment letter format
Increment letter formatIncrement letter format
Increment letter format
 
Downtown & Infill Tax Increment Districts
Downtown & Infill Tax Increment DistrictsDowntown & Infill Tax Increment Districts
Downtown & Infill Tax Increment Districts
 
Increment Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show modeIncrement Strategy ppt 2012-13 : Play this in slide show mode
Increment Strategy ppt 2012-13 : Play this in slide show mode
 
Lecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operatorsLecture 8 increment_and_decrement_operators
Lecture 8 increment_and_decrement_operators
 
String
StringString
String
 
Scrum - Agile Methodology
Scrum - Agile MethodologyScrum - Agile Methodology
Scrum - Agile Methodology
 
Iocl compensation
Iocl compensationIocl compensation
Iocl compensation
 
Incremental
IncrementalIncremental
Incremental
 
Intro To Scrum.V3
Intro To Scrum.V3Intro To Scrum.V3
Intro To Scrum.V3
 
Normal forest – growing stock and increment
Normal forest – growing stock and incrementNormal forest – growing stock and increment
Normal forest – growing stock and increment
 
Introduction to Redux
Introduction to ReduxIntroduction to Redux
Introduction to Redux
 
An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...An overview of techniques for detecting software variability concepts in sour...
An overview of techniques for detecting software variability concepts in sour...
 
C Prog. - Operators and Expressions
C Prog. - Operators and ExpressionsC Prog. - Operators and Expressions
C Prog. - Operators and Expressions
 
Kerala Service Rules-Part 1
Kerala Service Rules-Part 1Kerala Service Rules-Part 1
Kerala Service Rules-Part 1
 

Similar to Conflict Resolution In Kai

Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008
Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008
Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008Sergio Bossa
 
Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422journeyer
 
Make Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMake Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMySQLConference
 
Lightweight Grids With Terracotta
Lightweight Grids With TerracottaLightweight Grids With Terracotta
Lightweight Grids With TerracottaPT.JUG
 
How Prometheus Store the Data
How Prometheus Store the DataHow Prometheus Store the Data
How Prometheus Store the DataHao Chen
 
Kamaelia - Networking Using Generators
Kamaelia - Networking Using GeneratorsKamaelia - Networking Using Generators
Kamaelia - Networking Using Generatorskamaelian
 
Garbage collectors and Memory Leaks in Nodejs - V8
Garbage collectors and Memory Leaks in Nodejs - V8Garbage collectors and Memory Leaks in Nodejs - V8
Garbage collectors and Memory Leaks in Nodejs - V8Thien Ly
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Zabbix
 
Multi Core Playground
Multi Core PlaygroundMulti Core Playground
Multi Core PlaygroundESUG
 
PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...Jinwon Lee
 
Scale Out Your Graph Across Servers and Clouds with OrientDB
Scale Out Your Graph Across Servers and Clouds  with OrientDBScale Out Your Graph Across Servers and Clouds  with OrientDB
Scale Out Your Graph Across Servers and Clouds with OrientDBLuca Garulli
 
Running your Java EE 6 Applications in the Cloud
Running your Java EE 6 Applications in the CloudRunning your Java EE 6 Applications in the Cloud
Running your Java EE 6 Applications in the CloudArun Gupta
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustEvan Chan
 

Similar to Conflict Resolution In Kai (20)

Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008
Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008
Gridify your Spring application with Grid Gain @ Spring Italian Meeting 2008
 
Os Wilhelm
Os WilhelmOs Wilhelm
Os Wilhelm
 
Cloud Probing
Cloud ProbingCloud Probing
Cloud Probing
 
Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422
 
Make Your Life Easier With Maatkit
Make Your Life Easier With MaatkitMake Your Life Easier With Maatkit
Make Your Life Easier With Maatkit
 
Lightweight Grids With Terracotta
Lightweight Grids With TerracottaLightweight Grids With Terracotta
Lightweight Grids With Terracotta
 
How Prometheus Store the Data
How Prometheus Store the DataHow Prometheus Store the Data
How Prometheus Store the Data
 
MySQL Proxy
MySQL ProxyMySQL Proxy
MySQL Proxy
 
Kamaelia - Networking Using Generators
Kamaelia - Networking Using GeneratorsKamaelia - Networking Using Generators
Kamaelia - Networking Using Generators
 
Garbage collectors and Memory Leaks in Nodejs - V8
Garbage collectors and Memory Leaks in Nodejs - V8Garbage collectors and Memory Leaks in Nodejs - V8
Garbage collectors and Memory Leaks in Nodejs - V8
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
 
Multi Core Playground
Multi Core PlaygroundMulti Core Playground
Multi Core Playground
 
PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...PR-297: Training data-efficient image transformers & distillation through att...
PR-297: Training data-efficient image transformers & distillation through att...
 
Time for Comet?
Time for Comet?Time for Comet?
Time for Comet?
 
Scale Out Your Graph Across Servers and Clouds with OrientDB
Scale Out Your Graph Across Servers and Clouds  with OrientDBScale Out Your Graph Across Servers and Clouds  with OrientDB
Scale Out Your Graph Across Servers and Clouds with OrientDB
 
Running your Java EE 6 Applications in the Cloud
Running your Java EE 6 Applications in the CloudRunning your Java EE 6 Applications in the Cloud
Running your Java EE 6 Applications in the Cloud
 
Porting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to RustPorting a Streaming Pipeline from Scala to Rust
Porting a Streaming Pipeline from Scala to Rust
 
Memcached Presentation
Memcached PresentationMemcached Presentation
Memcached Presentation
 
XS Boston 2008 Network Topology
XS Boston 2008 Network TopologyXS Boston 2008 Network Topology
XS Boston 2008 Network Topology
 
XS Oracle 2009 Just Run It
XS Oracle 2009 Just Run ItXS Oracle 2009 Just Run It
XS Oracle 2009 Just Run It
 

Recently uploaded

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 

Recently uploaded (20)

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

Conflict Resolution In Kai

  • 1. Conflict resolution in Kai Shun'ichi Shinohara (shino) @ Erlang/distributed workshop #2 2008-11-20 (Thu), KLab, Tokyo Copyright by shino http://creativecommons.org/licenses/by-sa/2.1/jp/deed.en_US
  • 2. Goals ☞ Learn that events are NOT total ordered, but can be “concurrent” in a distributed system. ☞ Understand some points on conflict resolution in Amazon's Dynamo. ☞ Discuss on Kai's designs and implementations concerning conflict resolution. 2
  • 3. Contents (1/2) I.Lamport's logical clock ● “happens-before” and “concurrent”. ● Construction and features. ● Gleaning. II.Vector Clocks ● Construction. ● keeping slender (practice). 3
  • 4. Contents (2/2) III.Some points on Amazon's Dynamo ● Coordinators increment vector clocks. ● Client-side conflict resolution. ● Multiple versions in a single node. IV.Discussions on Kai's conflict resolution ● Packing checksums to “cas_unique” (lossy). ● How/Whether to merge a set of vector clocks. ● Consistency in putting data. 4
  • 5. I. Lamport's logical clock ● Idea ● Two events are NOT ALWAYS ordered in a distributed system. ● Orders for causally-related events are only meaningful. ● Inspired by A. Einstein's special relativity. 5
  • 6. Causality in Special Relativity ● Three types of regions ● Past, ● Future, ● Simultaneous. ● Simultaneous is ● not a plane, ● but a region. http://en.wikipedia.org/wiki/Light_cone 6
  • 7. Problem setting ● Processes communicate by passing messages. ● Messages <==> causal relationships. http://commons.wikimedia.org/wiki/Image:Lamport-Clock-en.svg 7
  • 8. “happens-before” (->) ● In a single process, -> means the “normal” order, e.g. A1 -> A2. ● E1/E2 are sending/receipt of the same message, then “E1 -> E2”, e.g. A2 -> B4. ● “E1 -> E2 and E2 -> E3” then “E1 -> E3”, e.g. A1 → B4. 8
  • 9. “concurrent” (||) ● “not (A -> B)” and “not (B -> A)”. ● e.g. A2 and B3, B4 and C2. 9
  • 10. Construction ● Increment when any event occurs, and ● Message receivers increment their clock to ● max(present clock, sender's clock) + 1. http://commons.wikimedia.org/wiki/Image:Lamport-Clock-en.svg 10
  • 11. Features ● When A -> B, C(A) < C(B). ---(*) ● Clock order is consistent with causality. ● But: C(A) < C(B) does not imply A -> B. ● Clock order does NOT determine a unique causality-relation. ● Just: C(A) <= C(B) implies “not (B -> A)” ● The contraposition of (*). ● “not (B -> A)” is equivalent “A -> B or A || B”. 11
  • 12. Gleaning ● Extension to total ordering. ● Label an event as (process id, logical clock). ● An (arbitrary) total order in {process id}. ● (P(A), C(A)) < (P(B), C(B)) if and only if ● C(A) < C(B), or ● C(A) = C(B) and P(A) < P(B). ● Example: mutex algorithm ● Non centralized, but all nodes should participate. ● Consult the paper for details :-) 12
  • 13. II. Vector Clocks ● Vector clocks provides the clock s.t. ● VC(A) < VC(B) then A -> B, ● VC(A) || VC(B) then A || B, ● Where VC(A) is the vector clock of an event A. ● It determines causality relation, not only “happens-before”, but also “concurrent” for events. ● In practice: keeping them slender. 13
  • 14. Construction ● A vector clock: a set of {process id, clock}. ● Increment own clock and merge (as follows). http://commons.wikimedia.org/wiki/Image:Vector_Clock.svg 14
  • 15. Keeping slender (practice) ● Element size can be bloated. ● As a size = #(related nodes) grows, ● Comparison slows down: not so bad (usually), ● Data consumes space (memory/storage): bad, ● Messages become fat: TOO BAD! ● Related nodes should be as few as possible. 15
  • 16. III. Some points from Amazon's Dynamo ☞ Coordinators increment vector clocks. ☞ Client-side conflict resolution. ☞ Multiple versions in a single node. ● This part includes my guess. 16
  • 17. Coordinators increment vector clocks ● In section 4.3 in the Dynamo paper. ● 4.3. Replication ● “Each key, k, is assigned to a coordinator node [...]. The coordinator is in charge of the replication of the data items that fall within its range.” ● Without failures, every vector clock has just one element. ● Suppress vector clocks to be fat. 17
  • 18. Client-side conflict resolution ● If concurrent versions exist in server-side, ● A Client receives multiple data, each with vector clocks, ● The client merge ● Not only data (e.g. for hash-type data, just “merge” them), ● But vector clocks. ● Servers don't know data schema. ● Both servers and clients know vector clock manipulation. 18
  • 19. Multiple versions in a single node ● In section 6 and 6.4 ● 6. Experiences & Lessons Learned ● “All the measurements presented in this section were taken on a live system operating with a configuration of (3,2,2)” ● 6.3 Divergent Versions: When and How Many? ● the number of versions returned [..] 99.94% of requests saw exactly one version; [..] and 0.00009% of requests saw 4 versions. ● A single node stores multiple conflicting data possibly. 19
  • 20. IV. Discussions on Kai's implementation ☞ Packing checksums to “cas_unique” ☞ How/Whether to merge a set of vector clocks ☞ Consistency in putting data 20
  • 21. Packing checksums to “cas_unique” ● Kai uses memcached protocol between clients and servers ● Memcached's “gets” and “cas” commands uses “cas_unique” field (uint64) for optimistic lock ● Mapping of Dynamo operations to memcached command: ● GET => gets, ● PUT => cas. 21
  • 22. Packing checksums to “cas_unique” ● Packing one checksum (no concurrent data). ● <<1:4, [data's checksum]:60>> ● Packing two checksums. ● <<2:4, [data1's checksum]:30, [data2's checksum]:30>> ● General cases are ● First 4 bit: #(data), ● [Each data's checksum]:(60/#(data)), ● Zero padding rest. ● Can ONLY check consistency (lossy). 22
  • 23. How/Whether to merge a set of vector clocks ● Clients don't merge vector clocks in Kai ● Who merges them??? ● kai_coordinator can merge the clocks in cas, ● At the cost of some message delay. ● Receive “cas” command, ● Collect data from replica nodes (just as “get”), ● Check consistency, cas_unique <==> vector clocks, ● Merge vector clocks, ● Store data in each replica node. ● Spoils the advantage of (N,R,W) flexibility. 23
  • 24. Alternatives (1/2) ● Break the memcached protocol in ● “The server will transmit back unstructured data in exactly the same way it received it” (from the memcached protocol). ● Sketch of implementation: ● Merge related vector clocks, ● Serialize the merged vector clock, ● Use cas_unique to represents its length ● Concatenate the serialized vector clock and the user data and send in a data block. 24
  • 25. Alternatives (2/2) ● Server side resolution, ● e.g. a user can add-on a module of resolution logic. ● Other protocols, ● e.g. HTTP, XMPP(i don't know much). ● Memcached's client libraries by C and various language wrappers are appealing. ● Nice idea? 25
  • 26. Consistency in putting data ● Problem ● Operations go on multiple nodes. ● How to guarantee the consistency between updating vector clocks and storing data? ● “Multiple data in a single node” can help? ● Appropriate to Dynamo's “always writable” policy. 26
  • 27. Summary ☞ Causality in a distributed system has only partial order. ☞ Vector clocks is easy to implement/use, but should pay attention to keep them slender. ☞ “Multiple versions in single node” seems nice. ● Please discuss anything later and at MLs. 27
  • 28. References ● Lamport's logical clock ● http://research.microsoft.com/users/lamport/pubs/pubs.html#time-clocks ● http://www.cs.tsukuba.ac.jp/~yas/sie/csys-2007/2008-02-29/ ● Vector clocks ● http://www.vs.inf.ethz.ch/publ/papers/VirtTimeGlobStates.pdf ● http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.7435 ● http://www.slideshare.net/takemaru/kai-an-open-source-implementation-of-amazons-dynamo-472179 ● Dynamo ● http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html ● Kai ● http://kai.wiki.sourceforge.net/ ● Memcached protocol ● http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt 28