SlideShare a Scribd company logo
1 of 24
Download to read offline
Distributed Systems Theory
for Mere Mortals
Ensar Basri Kahveci
1
Java Day Istanbul
May 2017
I am Ensar Basri Kahveci
Hello!
Distributed Systems Engineer
twitter & github
website
@ Hazelcast
metanet
basrikahveci.com
2
Hazelcast
◉ Leading open source Java IMDG
◉ Distributed Java collections, JCache, HD store, ….
◉ Distributed computation and messaging
◉ Embedded or client-server deployment
◉ Integration modules & cloud friendly
3
Hazelcast as a Distributed System
◉ Scale up & scale out
◉ Distributed from day one
◉ Dynamic clustering and elasticity
◉ Data partitioning and replication
◉ Fault tolerance
4
◉ Collection of entities trying to solve a common
problem
○ Communication via passing messages
○ Uncertain and partial knowledge
◉ We need distributed systems mostly because:
○ Scalability
○ Fault tolerance
Distributed Systems
5
◉ Independent failures
◉ Non-negligible message transmission delays
◉ Unreliable communication
Main Difficulties
6
◉ System models come into play.
○ Interaction models
○ Failure modes
○ Notion of time
◉ Consensus Problem
◉ CAP Principle
Systems Models
7
◉ Synchronous
◉ Partially synchronous
◉ Asynchronous
Hazelcast embraces the partially-synchronous model.
OperationTimeoutException
Interaction Models
8
Failure modes
◉ Crash-stop
◉ Omission faults
◉ Crash-recover
◉ Arbitrary failures (Byzantine)
Hazelcast handles crash-recover failures by
making them look like crash-stop failures.
9
Time & Order: Physical time
◉ Time is often used to order events in distributed
algorithms.
◉ Physical timestamps
○ LatestUpdateMapMergePolicy in Hazelcast
◉ Clock drifts can break latest update wins
◉ Google TrueTime
10
Time & Order: Logical Clocks
◉ Logical clocks (Lamport clocks)
○ Local counters and communication
◉ Defines happens-before relationship.
○ (i.e., causality)
Hazelcast extensively uses it along with
primary-copy replication.
11
Time & Order: Vector Clocks
◉ Inferring causality by comparing timestamps.
◉ Vector clocks are used to infer causality.
○ Dynamo-style databases use them to detect conflicts.
Lamport clocks work fine for Hazelcast because
there is only a single node performing the updates.
12
Consensus
◉ The problem of having a set of processes
agree on a value.
○ Leader election, state machine replication, strong
consistency, distributed transactions, …
◉ Safety
◉ Liveness
13
FLP Result
◉ In the asynchronous model, distributed consensus
may not be solved within bounded time if at least
one process can fail with crash-stop.
◉ It is because we cannot differentiate between a
crashed process or a slow process.
14
Unreliable Failure Detectors
◉ Local failure detectors which rely on timeouts and
can make mistakes.
◉ Two types of mistakes:
○ suspecting from a running process ⇒ ACCURACY
○ not suspecting from a failed process ⇒ COMPLETENESS
◉ Different types of failure detectors.
15
Two-Phase Commit (2PC)
◉ 2PC preserves safety, but it can lose liveness with
crash-stop failures.
16
Three-Phase Commit (3PC)
◉ 3PC tolerates crash-stop failures and preserves
liveness, but can lose safety with network partitions
or crash-recover failures.
17
Majority Based Consensus
◉ Availability of majority is needed for liveness
and safety.
○ 2f + 1 nodes tolerate failure of f nodes.
◉ Resiliency to crash-stop, network partitions
and crash-recover.
◉ Paxos, Zab, Raft, Viewstamped Replication
18
Consensus: Recap
We have 2PC and 3PC in Hazelcast, but not
majority based consensus algorithms.
◉ Consensus systems are mainly used for
achieving strong consistency.
19
CAP Principle
◉ Proposed by Eric Brewer in 2000.
○ Formally proved in 2002.
◉ A shared-data system cannot achieve perfect
consistency and perfect availability in the
presence of network partitions.
○ AP versus CP
◉ Widespread acceptance, and yet a lot of criticism.
20
Consistency and Availability
◉ Levels of consistency:
Data-centric (CP)
Client-centric (AP)
◉ Levels of availability:
High availability
Sticky availability
21
Hazelcast & CAP Principle
◉ Hazelcast is AP with primary-copy & async
replication.
primary-copy strong consistency
on a stable cluster
sticky availability
async replication high throughput possibility of losing
consistency on failures
22
No Hocus Pocus
◉ A lot of variations in the abstractions and models.
◉ Learn the fundamentals, the rest will change
anyway.
23
Any questions ?
Thanks!
24

More Related Content

What's hot

An Overview of Distributed Debugging
An Overview of Distributed DebuggingAn Overview of Distributed Debugging
An Overview of Distributed DebuggingAnant Narayanan
 
CAP theorem and distributed systems
CAP theorem and distributed systemsCAP theorem and distributed systems
CAP theorem and distributed systemsKlika Tech, Inc
 
All you didn't know about the CAP theorem
All you didn't know about the CAP theoremAll you didn't know about the CAP theorem
All you didn't know about the CAP theoremKanstantsin Hontarau
 
Client Centric Consistency Model
Client Centric Consistency ModelClient Centric Consistency Model
Client Centric Consistency ModelRajat Kumar
 
Microservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with TechnologyMicroservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with TechnologyLegacy Typesafe (now Lightbend)
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Boris Hristov
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Boris Hristov
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
CAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain SyndromeCAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain SyndromeDilum Bandara
 
The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!Boris Hristov
 
Replay your workload as it is your actual one!
Replay your workload as it is your actual one! Replay your workload as it is your actual one!
Replay your workload as it is your actual one! Boris Hristov
 
CurveZMQ, ZMTP and other Dubious Characters
CurveZMQ, ZMTP and other Dubious CharactersCurveZMQ, ZMTP and other Dubious Characters
CurveZMQ, ZMTP and other Dubious Characterspieterh
 
How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28
How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28
How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28Ruby Meditation
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
Consistency protocols
Consistency protocolsConsistency protocols
Consistency protocolsZongYing Lyu
 
The nightmare of locking, blocking and isolation levels
The nightmare of locking, blocking and isolation levelsThe nightmare of locking, blocking and isolation levels
The nightmare of locking, blocking and isolation levelsBoris Hristov
 
The nightmare of locking, blocking and deadlocking. SQLSaturday #257, Verona
The nightmare of locking, blocking and deadlocking. SQLSaturday #257, VeronaThe nightmare of locking, blocking and deadlocking. SQLSaturday #257, Verona
The nightmare of locking, blocking and deadlocking. SQLSaturday #257, VeronaBoris Hristov
 
Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...Neena R Krishna
 

What's hot (20)

An Overview of Distributed Debugging
An Overview of Distributed DebuggingAn Overview of Distributed Debugging
An Overview of Distributed Debugging
 
CAP theorem and distributed systems
CAP theorem and distributed systemsCAP theorem and distributed systems
CAP theorem and distributed systems
 
All you didn't know about the CAP theorem
All you didn't know about the CAP theoremAll you didn't know about the CAP theorem
All you didn't know about the CAP theorem
 
Client Centric Consistency Model
Client Centric Consistency ModelClient Centric Consistency Model
Client Centric Consistency Model
 
Microservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with TechnologyMicroservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with Technology
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
CAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain SyndromeCAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain Syndrome
 
The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!
 
Replay your workload as it is your actual one!
Replay your workload as it is your actual one! Replay your workload as it is your actual one!
Replay your workload as it is your actual one!
 
CurveZMQ, ZMTP and other Dubious Characters
CurveZMQ, ZMTP and other Dubious CharactersCurveZMQ, ZMTP and other Dubious Characters
CurveZMQ, ZMTP and other Dubious Characters
 
How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28
How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28
How to cook Rabbit on Production - Serhiy Nazarov | Ruby Meditation 28
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
Consistency protocols
Consistency protocolsConsistency protocols
Consistency protocols
 
The nightmare of locking, blocking and isolation levels
The nightmare of locking, blocking and isolation levelsThe nightmare of locking, blocking and isolation levels
The nightmare of locking, blocking and isolation levels
 
The nightmare of locking, blocking and deadlocking. SQLSaturday #257, Verona
The nightmare of locking, blocking and deadlocking. SQLSaturday #257, VeronaThe nightmare of locking, blocking and deadlocking. SQLSaturday #257, Verona
The nightmare of locking, blocking and deadlocking. SQLSaturday #257, Verona
 
Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...
 

Similar to Distributed Systems Theory for Mere Mortals - Java Day Istanbul May 2017

Sistemas Distribuidos
Sistemas DistribuidosSistemas Distribuidos
Sistemas DistribuidosLocaweb
 
Concurrent Programming in Java
Concurrent Programming in JavaConcurrent Programming in Java
Concurrent Programming in JavaLakshmi Narasimhan
 
Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!Izzet Mustafaiev
 
Storing the real world data
Storing the real world dataStoring the real world data
Storing the real world dataAthira Mukundan
 
Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)Eran Harel
 
The Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while PersistingThe Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while PersistingCTruncer
 
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)Ensar Basri Kahveci
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systemsAshwani Priyedarshi
 
DevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 SlidesDevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 SlidesAlex Cruise
 
Concurrent/ parallel programming
Concurrent/ parallel programmingConcurrent/ parallel programming
Concurrent/ parallel programmingTausun Akhtary
 
Fault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihunFault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihunTsegabrehan Am
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservicespflueras
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsTyler Treat
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesYoav Francis
 
lecture 8 b main memory
lecture 8 b main memorylecture 8 b main memory
lecture 8 b main memoryITNet
 
Fault tolerant presentation
Fault tolerant presentationFault tolerant presentation
Fault tolerant presentationskadyan1
 

Similar to Distributed Systems Theory for Mere Mortals - Java Day Istanbul May 2017 (20)

Sistemas Distribuidos
Sistemas DistribuidosSistemas Distribuidos
Sistemas Distribuidos
 
Concurrency in Java
Concurrency in JavaConcurrency in Java
Concurrency in Java
 
Concurrent Programming in Java
Concurrent Programming in JavaConcurrent Programming in Java
Concurrent Programming in Java
 
Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!
 
Storing the real world data
Storing the real world dataStoring the real world data
Storing the real world data
 
Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)
 
The Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while PersistingThe Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while Persisting
 
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systems
 
DevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 SlidesDevOps Days Vancouver 2014 Slides
DevOps Days Vancouver 2014 Slides
 
Concurrent/ parallel programming
Concurrent/ parallel programmingConcurrent/ parallel programming
Concurrent/ parallel programming
 
Fault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihunFault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihun
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
Advanced python
Advanced pythonAdvanced python
Advanced python
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed Systems
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and Practices
 
CQRS: Theory
CQRS: Theory CQRS: Theory
CQRS: Theory
 
The Google file system
The Google file systemThe Google file system
The Google file system
 
lecture 8 b main memory
lecture 8 b main memorylecture 8 b main memory
lecture 8 b main memory
 
Fault tolerant presentation
Fault tolerant presentationFault tolerant presentation
Fault tolerant presentation
 

More from Ensar Basri Kahveci

java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019Ensar Basri Kahveci
 
java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019Ensar Basri Kahveci
 
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019Ensar Basri Kahveci
 
java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019Ensar Basri Kahveci
 
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Ensar Basri Kahveci
 
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Ensar Basri Kahveci
 
Distributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship TurkeyDistributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship TurkeyEnsar Basri Kahveci
 
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017Ensar Basri Kahveci
 
Ankara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with ScalaAnkara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with ScalaEnsar Basri Kahveci
 

More from Ensar Basri Kahveci (9)

java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
 
java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019
 
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
 
java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019
 
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
 
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
 
Distributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship TurkeyDistributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
 
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
 
Ankara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with ScalaAnkara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with Scala
 

Recently uploaded

Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfRagavanV2
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...tanu pandey
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projectssmsksolar
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayEpec Engineered Technologies
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...soginsider
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityMorshed Ahmed Rahath
 
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoorTop Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoordharasingh5698
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 

Recently uploaded (20)

Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdf
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoorTop Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
Top Rated Call Girls In chittoor 📱 {7001035870} VIP Escorts chittoor
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 

Distributed Systems Theory for Mere Mortals - Java Day Istanbul May 2017

  • 1. Distributed Systems Theory for Mere Mortals Ensar Basri Kahveci 1 Java Day Istanbul May 2017
  • 2. I am Ensar Basri Kahveci Hello! Distributed Systems Engineer twitter & github website @ Hazelcast metanet basrikahveci.com 2
  • 3. Hazelcast ◉ Leading open source Java IMDG ◉ Distributed Java collections, JCache, HD store, …. ◉ Distributed computation and messaging ◉ Embedded or client-server deployment ◉ Integration modules & cloud friendly 3
  • 4. Hazelcast as a Distributed System ◉ Scale up & scale out ◉ Distributed from day one ◉ Dynamic clustering and elasticity ◉ Data partitioning and replication ◉ Fault tolerance 4
  • 5. ◉ Collection of entities trying to solve a common problem ○ Communication via passing messages ○ Uncertain and partial knowledge ◉ We need distributed systems mostly because: ○ Scalability ○ Fault tolerance Distributed Systems 5
  • 6. ◉ Independent failures ◉ Non-negligible message transmission delays ◉ Unreliable communication Main Difficulties 6
  • 7. ◉ System models come into play. ○ Interaction models ○ Failure modes ○ Notion of time ◉ Consensus Problem ◉ CAP Principle Systems Models 7
  • 8. ◉ Synchronous ◉ Partially synchronous ◉ Asynchronous Hazelcast embraces the partially-synchronous model. OperationTimeoutException Interaction Models 8
  • 9. Failure modes ◉ Crash-stop ◉ Omission faults ◉ Crash-recover ◉ Arbitrary failures (Byzantine) Hazelcast handles crash-recover failures by making them look like crash-stop failures. 9
  • 10. Time & Order: Physical time ◉ Time is often used to order events in distributed algorithms. ◉ Physical timestamps ○ LatestUpdateMapMergePolicy in Hazelcast ◉ Clock drifts can break latest update wins ◉ Google TrueTime 10
  • 11. Time & Order: Logical Clocks ◉ Logical clocks (Lamport clocks) ○ Local counters and communication ◉ Defines happens-before relationship. ○ (i.e., causality) Hazelcast extensively uses it along with primary-copy replication. 11
  • 12. Time & Order: Vector Clocks ◉ Inferring causality by comparing timestamps. ◉ Vector clocks are used to infer causality. ○ Dynamo-style databases use them to detect conflicts. Lamport clocks work fine for Hazelcast because there is only a single node performing the updates. 12
  • 13. Consensus ◉ The problem of having a set of processes agree on a value. ○ Leader election, state machine replication, strong consistency, distributed transactions, … ◉ Safety ◉ Liveness 13
  • 14. FLP Result ◉ In the asynchronous model, distributed consensus may not be solved within bounded time if at least one process can fail with crash-stop. ◉ It is because we cannot differentiate between a crashed process or a slow process. 14
  • 15. Unreliable Failure Detectors ◉ Local failure detectors which rely on timeouts and can make mistakes. ◉ Two types of mistakes: ○ suspecting from a running process ⇒ ACCURACY ○ not suspecting from a failed process ⇒ COMPLETENESS ◉ Different types of failure detectors. 15
  • 16. Two-Phase Commit (2PC) ◉ 2PC preserves safety, but it can lose liveness with crash-stop failures. 16
  • 17. Three-Phase Commit (3PC) ◉ 3PC tolerates crash-stop failures and preserves liveness, but can lose safety with network partitions or crash-recover failures. 17
  • 18. Majority Based Consensus ◉ Availability of majority is needed for liveness and safety. ○ 2f + 1 nodes tolerate failure of f nodes. ◉ Resiliency to crash-stop, network partitions and crash-recover. ◉ Paxos, Zab, Raft, Viewstamped Replication 18
  • 19. Consensus: Recap We have 2PC and 3PC in Hazelcast, but not majority based consensus algorithms. ◉ Consensus systems are mainly used for achieving strong consistency. 19
  • 20. CAP Principle ◉ Proposed by Eric Brewer in 2000. ○ Formally proved in 2002. ◉ A shared-data system cannot achieve perfect consistency and perfect availability in the presence of network partitions. ○ AP versus CP ◉ Widespread acceptance, and yet a lot of criticism. 20
  • 21. Consistency and Availability ◉ Levels of consistency: Data-centric (CP) Client-centric (AP) ◉ Levels of availability: High availability Sticky availability 21
  • 22. Hazelcast & CAP Principle ◉ Hazelcast is AP with primary-copy & async replication. primary-copy strong consistency on a stable cluster sticky availability async replication high throughput possibility of losing consistency on failures 22
  • 23. No Hocus Pocus ◉ A lot of variations in the abstractions and models. ◉ Learn the fundamentals, the rest will change anyway. 23