SlideShare a Scribd company logo
1 of 24
Download to read offline
1
DISTRIBUTED SYSTEMS THEORY
FOR MERE MORTALS
Kraków
May 2017
Ensar Basri Kahveci
I am Ensar Basri Kahveci
Hello!
Distributed Systems Engineer
twitter & github
website
@ Hazelcast
metanet
basrikahveci.com
2
Hazelcast
◉ Leading open source Java IMDG
◉ Distributed Java collections, JCache, …
◉ Distributed computations and messaging
◉ Embedded or client-server deployment
◉ Integration modules & cloud friendly
3
Hazelcast as a Distributed System
◉ Scale up & scale out
◉ Distributed from day one
◉ Dynamic clustering and elasticity
◉ Data partitioning and replication
◉ Fault tolerance and availability
4
◉ Collection of entities trying to solve a common
problem
○ Communication via passing messages
○ Uncertain and partial knowledge
◉ We need distributed systems mostly because:
○ Scalability
○ Fault tolerance
Distributed Systems
5
◉ Independent failures
◉ Non-negligible message transmission delays
◉ Unreliable communication
Main Difficulties
6
◉ Models and abstractions come into play.
○ Interaction models
○ Failure modes
○ Notion of time
◉ Consensus Problem
◉ CAP Principle
Systems Models
7
◉ Synchronous
◉ Partially synchronous
◉ Asynchronous
Hazelcast embraces the partially-synchronous model.
OperationTimeoutException
Interaction Models
8
Failure modes
◉ Crash-stop
◉ Omission faults
◉ Crash-recover
◉ Arbitrary failures (Byzantine)
Hazelcast handles crash-recover failures by
making them look like crash-stop failures.
9
Time & Order: Physical time
◉ Time is often used to order events in distributed
algorithms.
◉ Physical timestamps
○ LatestUpdateMapMergePolicy in Hazelcast
◉ Clock drifts can break latest update wins
◉ Google TrueTime
10
Time & Order: Logical Clocks
◉ Logical clocks (Lamport clocks)
○ Local counters and communication
◉ Defines happens-before relationship.
○ (i.e., causality)
Hazelcast extensively uses it along with
primary-copy replication.
11
Time & Order: Vector Clocks
◉ Inferring causality by comparing timestamps.
◉ Vector clocks are used to infer causality.
○ Dynamo-style databases use them to detect conflicts.
Lamport clocks work fine for Hazelcast because
there is only a single node performing the updates.
12
Consensus
◉ The problem of having a set of processes
agree on a value.
○ Leader election, state machine replication, strong
consistency, distributed transactions, …
◉ Safety
◉ Liveness
13
FLP Result
◉ In the asynchronous model, distributed consensus
may not be solved within bounded time if at least
one process can fail with crash-stop.
◉ It is because we cannot differentiate between a
crashed process or a slow process.
14
Unreliable Failure Detectors
◉ Local failure detectors which rely on timeouts and
can make mistakes.
◉ Two types of mistakes:
○ suspecting from a running process ⇒ ACCURACY
○ not suspecting from a failed process ⇒ COMPLETENESS
◉ Different types of failure detectors.
15
Two-Phase Commit (2PC)
◉ 2PC preserves safety, but it can lose liveness with
crash-stop failures.
16
Three-Phase Commit (3PC)
◉ 3PC tolerates crash-stop failures and preserves
liveness, but can lose safety with network partitions
or crash-recover failures.
17
Majority Based Consensus
◉ Availability of majority is needed for liveness
and safety.
○ 2f + 1 nodes tolerate failure of f nodes.
◉ Resiliency to crash-stop, network partitions
and crash-recover.
◉ Paxos, Zab, Raft, Viewstamped Replication
18
Consensus: Recap
We have 2PC and 3PC in Hazelcast, but not
majority based consensus algorithms.
◉ Consensus systems are mainly used for
achieving strong consistency.
19
CAP Principle
◉ Proposed by Eric Brewer in 2000.
○ Formally proved in 2002.
◉ A shared-data system cannot achieve perfect
consistency and perfect availability in the
presence of network partitions.
○ AP versus CP
◉ Widespread acceptance, and yet a lot of criticism.
20
Consistency and Availability
◉ Levels of consistency:
Data-centric (CP)
Client-centric (AP)
◉ Levels of availability:
High availability
Sticky availability
21
Hazelcast & CAP Principle
◉ Hazelcast is an AP product with
primary-copy & async replication.
primary-copy strong consistency
on a stable cluster
stickiness
async replication high throughput replication lag
22
No Hocus Pocus
◉ A lot of variations in the theory.
◉ Distributed systems are everywhere.
◉ Learn the fundamentals, the rest will change
anyway.
23
Any questions ?
Thanks!
24

More Related Content

What's hot

All you didn't know about the CAP theorem
All you didn't know about the CAP theoremAll you didn't know about the CAP theorem
All you didn't know about the CAP theoremKanstantsin Hontarau
 
Nondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evilNondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evilracesworkshop
 
Department of CSE, National University Bangladesh
Department of CSE, National University BangladeshDepartment of CSE, National University Bangladesh
Department of CSE, National University BangladeshTamanna Nayeema
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Boris Hristov
 
New Multi-Hop Clustering Algorithm for Vehicular Ad Hoc Networks
New Multi-Hop Clustering Algorithm for Vehicular Ad Hoc NetworksNew Multi-Hop Clustering Algorithm for Vehicular Ad Hoc Networks
New Multi-Hop Clustering Algorithm for Vehicular Ad Hoc NetworksJAYAPRAKASH JPINFOTECH
 
CAP theorem and distributed systems
CAP theorem and distributed systemsCAP theorem and distributed systems
CAP theorem and distributed systemsKlika Tech, Inc
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
Process scheduling
Process schedulingProcess scheduling
Process schedulingmarangburu42
 
Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...Neena R Krishna
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking VN
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!Boris Hristov
 
Deep Into Isolation Levels
Deep Into Isolation LevelsDeep Into Isolation Levels
Deep Into Isolation LevelsBoris Hristov
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Boris Hristov
 
The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!Boris Hristov
 
Client Centric Consistency Model
Client Centric Consistency ModelClient Centric Consistency Model
Client Centric Consistency ModelRajat Kumar
 
CAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain SyndromeCAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain SyndromeDilum Bandara
 
Replay your workload as it is your actual one!
Replay your workload as it is your actual one! Replay your workload as it is your actual one!
Replay your workload as it is your actual one! Boris Hristov
 

What's hot (20)

All you didn't know about the CAP theorem
All you didn't know about the CAP theoremAll you didn't know about the CAP theorem
All you didn't know about the CAP theorem
 
Nondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evilNondeterminism is unavoidable, but data races are pure evil
Nondeterminism is unavoidable, but data races are pure evil
 
Department of CSE, National University Bangladesh
Department of CSE, National University BangladeshDepartment of CSE, National University Bangladesh
Department of CSE, National University Bangladesh
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!
 
New Multi-Hop Clustering Algorithm for Vehicular Ad Hoc Networks
New Multi-Hop Clustering Algorithm for Vehicular Ad Hoc NetworksNew Multi-Hop Clustering Algorithm for Vehicular Ad Hoc Networks
New Multi-Hop Clustering Algorithm for Vehicular Ad Hoc Networks
 
CAP theorem and distributed systems
CAP theorem and distributed systemsCAP theorem and distributed systems
CAP theorem and distributed systems
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
Process scheduling
Process schedulingProcess scheduling
Process scheduling
 
Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...Multiprocessing -Interprocessing communication and process sunchronization,se...
Multiprocessing -Interprocessing communication and process sunchronization,se...
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
Grokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applicationsGrokking Techtalk #39: Gossip protocol and applications
Grokking Techtalk #39: Gossip protocol and applications
 
Concurrency
ConcurrencyConcurrency
Concurrency
 
The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!The Nightmare of Locking, Blocking and Isolation Levels!
The Nightmare of Locking, Blocking and Isolation Levels!
 
Deep Into Isolation Levels
Deep Into Isolation LevelsDeep Into Isolation Levels
Deep Into Isolation Levels
 
Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!Welcome to the nightmare of locking, blocking and isolation levels!
Welcome to the nightmare of locking, blocking and isolation levels!
 
The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!The nightmare of locking, blocking and isolation levels!
The nightmare of locking, blocking and isolation levels!
 
Client Centric Consistency Model
Client Centric Consistency ModelClient Centric Consistency Model
Client Centric Consistency Model
 
CAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain SyndromeCAP Theorem and Split Brain Syndrome
CAP Theorem and Split Brain Syndrome
 
Replay your workload as it is your actual one!
Replay your workload as it is your actual one! Replay your workload as it is your actual one!
Replay your workload as it is your actual one!
 

Similar to Distributed Systems Theory for Mere Mortals - GeeCON Krakow May 2017

Concurrent Programming in Java
Concurrent Programming in JavaConcurrent Programming in Java
Concurrent Programming in JavaLakshmi Narasimhan
 
Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!Izzet Mustafaiev
 
Storing the real world data
Storing the real world dataStoring the real world data
Storing the real world dataAthira Mukundan
 
Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)Eran Harel
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systemsAshwani Priyedarshi
 
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)Ensar Basri Kahveci
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsTyler Treat
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesYoav Francis
 
Distributed computing for new bloods
Distributed computing for new bloodsDistributed computing for new bloods
Distributed computing for new bloodsRaymond Tay
 
ESORICS 2014: Local Password validation using Self-Organizing Maps
ESORICS 2014: Local Password validation using Self-Organizing MapsESORICS 2014: Local Password validation using Self-Organizing Maps
ESORICS 2014: Local Password validation using Self-Organizing MapsDiogo Mónica
 
containing byzantine failures with control zones
containing  byzantine failures with control zonescontaining  byzantine failures with control zones
containing byzantine failures with control zonesvishnuRajan20
 
The Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while PersistingThe Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while PersistingCTruncer
 
Fault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihunFault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihunTsegabrehan Am
 
Message Passing, Remote Procedure Calls and Distributed Shared Memory as Com...
Message Passing, Remote Procedure Calls and  Distributed Shared Memory as Com...Message Passing, Remote Procedure Calls and  Distributed Shared Memory as Com...
Message Passing, Remote Procedure Calls and Distributed Shared Memory as Com...Sehrish Asif
 
01-distributed-systems-primer.pdf
01-distributed-systems-primer.pdf01-distributed-systems-primer.pdf
01-distributed-systems-primer.pdfBehroozDadkhah
 
Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...Umair Shahid
 

Similar to Distributed Systems Theory for Mere Mortals - GeeCON Krakow May 2017 (20)

Concurrency in Java
Concurrency in JavaConcurrency in Java
Concurrency in Java
 
Concurrent Programming in Java
Concurrent Programming in JavaConcurrent Programming in Java
Concurrent Programming in Java
 
Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!Fault tolerance - look, it's simple!
Fault tolerance - look, it's simple!
 
Storing the real world data
Storing the real world dataStoring the real world data
Storing the real world data
 
Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)Reactive by example (DevOpsDaysTLV 2019)
Reactive by example (DevOpsDaysTLV 2019)
 
Designing large scale distributed systems
Designing large scale distributed systemsDesigning large scale distributed systems
Designing large scale distributed systems
 
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
From AP to CP and Back: The Curious Case of Hazelcast (jdk.io 2018)
 
CQRS: Theory
CQRS: Theory CQRS: Theory
CQRS: Theory
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed Systems
 
CAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and PracticesCAP Theorem - Theory, Implications and Practices
CAP Theorem - Theory, Implications and Practices
 
Distributed computing for new bloods
Distributed computing for new bloodsDistributed computing for new bloods
Distributed computing for new bloods
 
The Google file system
The Google file systemThe Google file system
The Google file system
 
ESORICS 2014: Local Password validation using Self-Organizing Maps
ESORICS 2014: Local Password validation using Self-Organizing MapsESORICS 2014: Local Password validation using Self-Organizing Maps
ESORICS 2014: Local Password validation using Self-Organizing Maps
 
containing byzantine failures with control zones
containing  byzantine failures with control zonescontaining  byzantine failures with control zones
containing byzantine failures with control zones
 
Byzantine
ByzantineByzantine
Byzantine
 
The Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while PersistingThe Supporting Role of Antivirus Evasion while Persisting
The Supporting Role of Antivirus Evasion while Persisting
 
Fault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihunFault tolerance review by tsegabrehan zerihun
Fault tolerance review by tsegabrehan zerihun
 
Message Passing, Remote Procedure Calls and Distributed Shared Memory as Com...
Message Passing, Remote Procedure Calls and  Distributed Shared Memory as Com...Message Passing, Remote Procedure Calls and  Distributed Shared Memory as Com...
Message Passing, Remote Procedure Calls and Distributed Shared Memory as Com...
 
01-distributed-systems-primer.pdf
01-distributed-systems-primer.pdf01-distributed-systems-primer.pdf
01-distributed-systems-primer.pdf
 
Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...Clustering in PostgreSQL - Because one database server is never enough (and n...
Clustering in PostgreSQL - Because one database server is never enough (and n...
 

More from Ensar Basri Kahveci

java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019Ensar Basri Kahveci
 
java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019Ensar Basri Kahveci
 
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019Ensar Basri Kahveci
 
java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019Ensar Basri Kahveci
 
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Ensar Basri Kahveci
 
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Ensar Basri Kahveci
 
Distributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship TurkeyDistributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship TurkeyEnsar Basri Kahveci
 
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017Ensar Basri Kahveci
 
Ankara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with ScalaAnkara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with ScalaEnsar Basri Kahveci
 

More from Ensar Basri Kahveci (9)

java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
java.util.concurrent for Distributed Coordination - Berlin Expert Days 2019
 
java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019java.util.concurrent for Distributed Coordination, Riga DevDays 2019
java.util.concurrent for Distributed Coordination, Riga DevDays 2019
 
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
java.util.concurrent for Distributed Coordination, GeeCON Krakow 2019
 
java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019java.util.concurrent for Distributed Coordination, JEEConf 2019
java.util.concurrent for Distributed Coordination, JEEConf 2019
 
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
Replication Distilled: Hazelcast Deep Dive @ In-Memory Computing Summit San F...
 
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
Replication Distilled: Hazelcast Deep Dive - Berlin Expert Days 2018
 
Distributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship TurkeyDistributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
Distributed Systems Theory for Mere Mortals - Software Craftsmanship Turkey
 
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
Distributed Systems Theory for Mere Mortals - Topconf Dusseldorf October 2017
 
Ankara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with ScalaAnkara Jug - Practical Functional Programming with Scala
Ankara Jug - Practical Functional Programming with Scala
 

Recently uploaded

Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxStephen Sitton
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier Fernández Muñoz
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxRomil Mishra
 
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.elesangwon
 
Introduction to Machine Learning Part1.pptx
Introduction to Machine Learning Part1.pptxIntroduction to Machine Learning Part1.pptx
Introduction to Machine Learning Part1.pptxPavan Mohan Neelamraju
 
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...Amil baba
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Romil Mishra
 
Machine Learning 5G Federated Learning.pdf
Machine Learning 5G Federated Learning.pdfMachine Learning 5G Federated Learning.pdf
Machine Learning 5G Federated Learning.pdfadeyimikaipaye
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Sumanth A
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfShreyas Pandit
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labsamber724300
 
Livre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdf
Livre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdfLivre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdf
Livre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdfsaad175691
 
KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosVictor Morales
 
70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical trainingGladiatorsKasper
 
Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...
Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...
Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...Ayisha586983
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewsandhya757531
 
AntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptxAntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptxLina Kadam
 

Recently uploaded (20)

Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptx
 
Javier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptxJavier_Fernandez_CARS_workshop_presentation.pptx
Javier_Fernandez_CARS_workshop_presentation.pptx
 
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptxTriangulation survey (Basic Mine Surveying)_MI10412MI.pptx
Triangulation survey (Basic Mine Surveying)_MI10412MI.pptx
 
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
 
Introduction to Machine Learning Part1.pptx
Introduction to Machine Learning Part1.pptxIntroduction to Machine Learning Part1.pptx
Introduction to Machine Learning Part1.pptx
 
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based question
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________
 
Machine Learning 5G Federated Learning.pdf
Machine Learning 5G Federated Learning.pdfMachine Learning 5G Federated Learning.pdf
Machine Learning 5G Federated Learning.pdf
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
 
Theory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdfTheory of Machine Notes / Lecture Material .pdf
Theory of Machine Notes / Lecture Material .pdf
 
Secure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech LabsSecure Key Crypto - Tech Paper JET Tech Labs
Secure Key Crypto - Tech Paper JET Tech Labs
 
Livre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdf
Livre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdfLivre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdf
Livre Implementing_Six_Sigma_and_Lean_A_prac([Ron_Basu]_).pdf
 
Versatile Engineering Construction Firms
Versatile Engineering Construction FirmsVersatile Engineering Construction Firms
Versatile Engineering Construction Firms
 
KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitos
 
70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training70 POWER PLANT IAE V2500 technical training
70 POWER PLANT IAE V2500 technical training
 
Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...
Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...
Submerged Combustion, Explosion Flame Combustion, Pulsating Combustion, and E...
 
Artificial Intelligence in Power System overview
Artificial Intelligence in Power System overviewArtificial Intelligence in Power System overview
Artificial Intelligence in Power System overview
 
AntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptxAntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptx
 

Distributed Systems Theory for Mere Mortals - GeeCON Krakow May 2017

  • 1. 1 DISTRIBUTED SYSTEMS THEORY FOR MERE MORTALS Kraków May 2017 Ensar Basri Kahveci
  • 2. I am Ensar Basri Kahveci Hello! Distributed Systems Engineer twitter & github website @ Hazelcast metanet basrikahveci.com 2
  • 3. Hazelcast ◉ Leading open source Java IMDG ◉ Distributed Java collections, JCache, … ◉ Distributed computations and messaging ◉ Embedded or client-server deployment ◉ Integration modules & cloud friendly 3
  • 4. Hazelcast as a Distributed System ◉ Scale up & scale out ◉ Distributed from day one ◉ Dynamic clustering and elasticity ◉ Data partitioning and replication ◉ Fault tolerance and availability 4
  • 5. ◉ Collection of entities trying to solve a common problem ○ Communication via passing messages ○ Uncertain and partial knowledge ◉ We need distributed systems mostly because: ○ Scalability ○ Fault tolerance Distributed Systems 5
  • 6. ◉ Independent failures ◉ Non-negligible message transmission delays ◉ Unreliable communication Main Difficulties 6
  • 7. ◉ Models and abstractions come into play. ○ Interaction models ○ Failure modes ○ Notion of time ◉ Consensus Problem ◉ CAP Principle Systems Models 7
  • 8. ◉ Synchronous ◉ Partially synchronous ◉ Asynchronous Hazelcast embraces the partially-synchronous model. OperationTimeoutException Interaction Models 8
  • 9. Failure modes ◉ Crash-stop ◉ Omission faults ◉ Crash-recover ◉ Arbitrary failures (Byzantine) Hazelcast handles crash-recover failures by making them look like crash-stop failures. 9
  • 10. Time & Order: Physical time ◉ Time is often used to order events in distributed algorithms. ◉ Physical timestamps ○ LatestUpdateMapMergePolicy in Hazelcast ◉ Clock drifts can break latest update wins ◉ Google TrueTime 10
  • 11. Time & Order: Logical Clocks ◉ Logical clocks (Lamport clocks) ○ Local counters and communication ◉ Defines happens-before relationship. ○ (i.e., causality) Hazelcast extensively uses it along with primary-copy replication. 11
  • 12. Time & Order: Vector Clocks ◉ Inferring causality by comparing timestamps. ◉ Vector clocks are used to infer causality. ○ Dynamo-style databases use them to detect conflicts. Lamport clocks work fine for Hazelcast because there is only a single node performing the updates. 12
  • 13. Consensus ◉ The problem of having a set of processes agree on a value. ○ Leader election, state machine replication, strong consistency, distributed transactions, … ◉ Safety ◉ Liveness 13
  • 14. FLP Result ◉ In the asynchronous model, distributed consensus may not be solved within bounded time if at least one process can fail with crash-stop. ◉ It is because we cannot differentiate between a crashed process or a slow process. 14
  • 15. Unreliable Failure Detectors ◉ Local failure detectors which rely on timeouts and can make mistakes. ◉ Two types of mistakes: ○ suspecting from a running process ⇒ ACCURACY ○ not suspecting from a failed process ⇒ COMPLETENESS ◉ Different types of failure detectors. 15
  • 16. Two-Phase Commit (2PC) ◉ 2PC preserves safety, but it can lose liveness with crash-stop failures. 16
  • 17. Three-Phase Commit (3PC) ◉ 3PC tolerates crash-stop failures and preserves liveness, but can lose safety with network partitions or crash-recover failures. 17
  • 18. Majority Based Consensus ◉ Availability of majority is needed for liveness and safety. ○ 2f + 1 nodes tolerate failure of f nodes. ◉ Resiliency to crash-stop, network partitions and crash-recover. ◉ Paxos, Zab, Raft, Viewstamped Replication 18
  • 19. Consensus: Recap We have 2PC and 3PC in Hazelcast, but not majority based consensus algorithms. ◉ Consensus systems are mainly used for achieving strong consistency. 19
  • 20. CAP Principle ◉ Proposed by Eric Brewer in 2000. ○ Formally proved in 2002. ◉ A shared-data system cannot achieve perfect consistency and perfect availability in the presence of network partitions. ○ AP versus CP ◉ Widespread acceptance, and yet a lot of criticism. 20
  • 21. Consistency and Availability ◉ Levels of consistency: Data-centric (CP) Client-centric (AP) ◉ Levels of availability: High availability Sticky availability 21
  • 22. Hazelcast & CAP Principle ◉ Hazelcast is an AP product with primary-copy & async replication. primary-copy strong consistency on a stable cluster stickiness async replication high throughput replication lag 22
  • 23. No Hocus Pocus ◉ A lot of variations in the theory. ◉ Distributed systems are everywhere. ◉ Learn the fundamentals, the rest will change anyway. 23