This talk introduces concepts essential to understanding distributed systems, including the eight fallacies of distributed computing, the anatomy of a distributed system, system models, the CAP theorem, consistency models, partitioning, replication, leader election, failure detection, and consensus algorithms. It is the first in a three-part series designed to familiarize the audience with the design and use of distributed systems.
3. I N T R O D U C T I O N
3
W H AT I S A D I S T R I B U T E D S Y S T E M ?
A N AT O M Y O F A D I S T R I B U T E D S Y S T E M
FA L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
6. A collection of independent computers
that appear to users as a single coherent
system
W H AT I S A D I S T R I B U T E D S Y S T E M ?
6
7. “A collection of independent
computers that appear to the users
of the system as a single computer”
— Andrew Tanenbaum
W H AT I S A D I S T R I B U T E D S Y S T E M ?
7
8. “You know you have a distributed
system when the crash of a
computer you’ve never heard of
stops you from getting any work
done”
— Leslie Lamport
W H AT I S A D I S T R I B U T E D S Y S T E M ?
8
9. • Scalability and fault tolerance
• Memory, disk, and CPU are finite resources
• Computers crash and networks fail
• Science hasn’t kept up with technological needs
W H AT I S A D I S T R I B U T E D S Y S T E M ?
9
10. B U T
D I S T R I B U T E D
S Y S T E M S A R E
H A R D !
1 0
11. T H E T W O G E N E R A L S P R O B L E M
1 1
• Two generals on the opposite sides of a valley have to
coordinate to decide when to attack
• Each general must be sure the other made the same
decision
• Generals can only communicate through messages
• Messengers sent through the valley can be captured
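The bind the generals are in can be illustrated with a small simulation (a sketch in Python; the drop probability, round count, and function names are illustrative, not from the talk). However many confirmation rounds the generals run, there are always executions where the last messenger is captured and one general is left uncertain:

```python
import random

def exchange(drop_probability, rounds):
    """Simulate generals A and B exchanging confirmations over a lossy
    valley. Returns (a_confident, b_confident): whether each general
    received the other's latest message."""
    a_confident = False  # A heard back from B
    b_confident = False  # B heard A's latest message
    for _ in range(rounds):
        # A sends a messenger; the messenger may be captured.
        if random.random() < drop_probability:
            return a_confident, b_confident
        b_confident = True
        # B sends an acknowledgment; that messenger may also be captured.
        if random.random() < drop_probability:
            return a_confident, b_confident
        a_confident = True
    return a_confident, b_confident

random.seed(42)
outcomes = [exchange(drop_probability=0.3, rounds=5) for _ in range(10_000)]
# Runs exist where one general is confident and the other is not --
# adding more rounds shrinks the window but never closes it.
disagreements = sum(1 for a, b in outcomes if a != b)
print(disagreements > 0)
```

Adding more rounds only moves the uncertainty to the latest message, which is the heart of the impossibility argument.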
12. A N AT O M Y O F A
D I S T R I B U T E D S Y S T E M
1 2
13. Nodes
A N AT O M Y O F A D I S T R I B U T E D
S Y S T E M
1 3
16. • Each independent component of a distributed
system is called a node
• Also known as a process, agent or actor
• Operations within a node are fast
• Communication between nodes is slow
• Operations generally occur in order
N O D E S
1 6
17. S Y S T E M M O D E L
1 7
SYNCHRONOUS
• Bounded message delays
• Accurate global clock
• Easy to reason about
• You don’t have one
ASYNCHRONOUS
• Processes execute independently
• Unbounded message delays
• No global clock
• Difficult to reason about
• You have one
18. • Nodes communicate via messages
• Example: UDP, TCP, HTTP
M E S S A G E PA S S I N G
1 8
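Message passing can be demonstrated in a few lines (a minimal sketch using Python's standard `socket` module; the loopback address and `b"ping"` payload are illustrative). UDP makes none of the guarantees the fallacies below warn about: datagrams on a real network may be lost, duplicated, or reordered.

```python
import socket

# One socket plays the receiver, another the sender, both on loopback.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # let the OS pick a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", addr)      # fire-and-forget: no delivery guarantee

message, _ = receiver.recvfrom(1024)
print(message)                    # b'ping' -- on loopback, delivery is near-certain
sender.close()
receiver.close()
```

TCP layers retries, ordering, and flow control on top of the same unreliable substrate, which is why it costs more latency.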
19. FA L L A C I E S O F
D I S T R I B U T E D
C O M P U T I N G
1 9
20. FA L L A C I E S O F D I S T R I B U T E D C O M P U T I N G
2 0
The network is reliable
Latency is zero
Bandwidth is infinite
The network is secure
Topology doesn’t change
There is one administrator
Transport cost is zero
The network is homogeneous
28. FALLACY #1
THE NETWORK IS
RELIABLE
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
2 8
• On average 5.2 devices and 40.8 links fail per day in
Microsoft data centers
• The majority of Google’s outages that lasted more than 30
seconds were due to network maintenance or connectivity
issues
• If network hardware doesn’t fail, software will
• We cannot rely on the network to deliver our communications
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
2 9
30. FALLACY #2
LATENCY IS ZERO
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 0
31. • Latency is the time it takes for a signal to travel from one
computer to another
• Latency is bounded below by the speed of light
• It takes 40 milliseconds for light to travel from New York to
Paris and back
• The JVM executes billions of instructions per second
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 1
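The 40-millisecond figure can be checked with back-of-the-envelope arithmetic (the great-circle distance and the fiber refraction factor below are approximations, not numbers from the talk):

```python
# Round-trip time between New York and Paris at the speed of light.
SPEED_OF_LIGHT_KM_S = 299_792
NY_PARIS_KM = 5_837          # approximate great-circle distance

rtt_vacuum_ms = 2 * NY_PARIS_KM / SPEED_OF_LIGHT_KM_S * 1000
print(round(rtt_vacuum_ms))  # ~39 ms: light in a vacuum, round trip

# Light in optical fiber travels at roughly 2/3 c, so a real link is slower.
rtt_fiber_ms = rtt_vacuum_ms / (2 / 3)
print(round(rtt_fiber_ms))   # ~58 ms before any routing or queueing delay
```

A CPU executing billions of instructions per second can do tens of millions of units of work in the time one such round trip takes, which is why chatty protocols dominate distributed-system cost.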
33. • Bandwidth is roughly the amount of information that can
be transmitted each second
• Networks are limited by hardware
• Applications are limited by software
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 3
34. FALLACY #4
THE NETWORK IS
SECURE
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 4
35. • We see hacks of major corporations’ networks seemingly
on a weekly basis
• In 2015, Foxglove Security discovered a major
vulnerability in Java’s serialization framework
• Allowing remote access to friendly users opens systems up
to unfriendly ones
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 5
36. FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 6
D ATA B R E A C H E S S I N C E 2 0 0 5
38. • Administrators add and remove servers from networks
• We cannot depend on machines always being in the same
place
• Service discovery and routing layers solve this problem
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 8
39. FALLACY #6
THERE IS ONE
ADMINISTRATOR
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
3 9
40. • Production systems are often maintained and managed by
numerous people
• Multiple administrators may institute conflicting policies
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
4 0
42. • Local processing is cheap
• Network communication is expensive
• Latency and bandwidth ensure transport cost is never zero
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
4 2
43. FALLACY #8
THE NETWORK IS
HOMOGENEOUS
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
4 3
44. • Applications must be designed to work in a variety of
environments
• Wired networks
• Wireless networks
• Cellular networks
• Satellite networks
FA L L A C I E S O F D I S T R I B U T E D
C O M P U T I N G
4 4
46. C O N C E P T S
4 6
T I M E I N D I S T R I B U T E D S Y S T E M S
C O N S I S T E N C Y I N D I S T R I B U T E D S Y S T E M S
PA R T I T I O N I N G A N D R E P L I C AT I O N
49. T H E C A P T H E O R E M
T R A D E O F F S I N D I S T R I B U T E D S Y S T E M S
4 9
            CONSISTENCY        AVAILABILITY   PARTITION TOLERANCE
ZOOKEEPER   STRONG             QUORUM         YES
DYNAMO      EVENTUALLY STRONG  HIGH           YES
MYSQL       STRONG             HIGH           NO
50. O R D E R I N D I S T R I B U T E D
S Y S T E M S
5 0
51. • Order is necessary to enforce causal relationships
• Two types of order in distributed systems
• Partial order
• Order of dependent events
• Total order
• Order of all events
• Single-threaded applications are totally ordered
O R D E R I N D I S T R I B U T E D S Y S T E M S
5 1
52. T I M E I N D I S T R I B U T E D
S Y S T E M S
5 2
53. • Time can be used to enforce order
• Time can be used to enforce bounds on communications
• But time progresses independently in asynchronous
systems
• Clocks suffer from clock drift
• Even NTP can only synchronize clocks to within a few
milliseconds of each other
T I M E I N D I S T R I B U T E D S Y S T E M S
5 3
54. T I M E I N D I S T R I B U T E D S Y S T E M S
5 4
55. • “Time, Clocks, and the Ordering of Events in a Distributed
System”
• Developed by Leslie Lamport in 1978
• One of the seminal papers in distributed systems
• Determines partial ordering of events in a distributed
system
• Also referred to as logical clocks
T I M E I N D I S T R I B U T E D S Y S T E M S
5 5
L A M P O R T C L O C K S
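Lamport's two rules fit in a few lines (a minimal Python sketch; the class and variable names are illustrative): increment the counter on every local event or send, and on receipt take the maximum of the local and received timestamps plus one.

```python
class LamportClock:
    """Lamport logical clock: tick on local events and sends,
    merge on receives."""
    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance for a local event or an outgoing message."""
        self.time += 1
        return self.time

    def receive(self, sent_time):
        """Merge the sender's timestamp: max(local, received) + 1."""
        self.time = max(self.time, sent_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.tick()           # a's send event: time 1
t_recv = b.receive(t_send)  # b's receive: max(0, 1) + 1 = 2
print(t_send, t_recv)       # 1 2 -- the send is ordered before the receive
```

Lamport clocks give a partial order: if event x caused event y, then clock(x) < clock(y), but the converse does not hold, which is what motivates vector clocks.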
56. T I M E I N D I S T R I B U T E D S Y S T E M S
5 6
57. • “Timestamps in Message Passing Systems That Preserve the
Partial Ordering” - Colin J. Fidge
• “Virtual Time and Global States of Distributed Systems” -
Friedemann Mattern
• Independently developed by Fidge (1988) and Mattern (1989)
• Determines causal ordering of events in a distributed system
• Closely related to version vectors
T I M E I N D I S T R I B U T E D S Y S T E M S
5 7
V E C T O R C L O C K S
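Unlike a Lamport clock, a vector clock keeps one counter per node, which lets it distinguish causally ordered events from concurrent ones. A minimal sketch (the dict-based representation and node names here are illustrative):

```python
def vc_increment(clock, node):
    """Advance this node's entry in its vector clock."""
    clock = dict(clock)
    clock[node] = clock.get(node, 0) + 1
    return clock

def vc_merge(local, received, node):
    """On receive: take the element-wise max, then tick the local entry."""
    merged = {n: max(local.get(n, 0), received.get(n, 0))
              for n in local.keys() | received.keys()}
    return vc_increment(merged, node)

def happened_before(a, b):
    """True iff every entry of a is <= b and at least one is strictly less."""
    nodes = a.keys() | b.keys()
    return (all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
            and any(a.get(n, 0) < b.get(n, 0) for n in nodes))

x = vc_increment({}, "A")    # {"A": 1}
y = vc_merge({}, x, "B")     # B observed A's event: {"A": 1, "B": 1}
z = vc_increment({}, "C")    # {"C": 1}, independent of x and y
print(happened_before(x, y))                          # True: causally ordered
print(happened_before(x, z), happened_before(z, x))   # False False: concurrent
```

When neither clock happened before the other, the events are concurrent, and a data store like Dynamo must surface or resolve the conflict.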
58. T I M E I N D I S T R I B U T E D S Y S T E M S
5 8
60. • Linearizability
• Sequential consistency
• Causal consistency
• Eventual strong consistency
• Eventual consistency
C O N S I S T E N C Y M O D E L S
6 0
61. • Monotonic read consistency
• Monotonic write consistency
• Read-your-writes consistency
• Writes follow reads consistency
• Serializability
C O N S I S T E N C Y M O D E L S
6 1
M O R E C O N S I S T E N C Y M O D E L S
63. • Split data across multiple machines
• Reduces the amount of data each node must handle
• Reduces the amount of network I/O for certain algorithms
PA R T I T I O N I N G
6 3
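The simplest way to split data is hash partitioning (a sketch; `sha256` is used here because Python's built-in `hash()` is salted per process and would not give stable placement):

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a key deterministically to one of N partitions using a
    stable cryptographic hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

keys = ["user:1", "user:2", "user:3", "order:99"]
placement = {k: partition_for(k, 4) for k in keys}
print(placement)  # each key lands deterministically on one of 4 partitions
```

The weakness of the modulo scheme is that changing the partition count remaps most keys, which is the problem consistent hashing (covered below) addresses.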
65. • Sharing information to ensure consistency between
redundant services
• Active replication — push
• Passive replication — pull
• Quorum-based
• Gossip
R E P L I C AT I O N
6 5
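The quorum-based approach can be captured in one inequality (a sketch; the Dynamo-style N/W/R parameter names are conventional, not from the talk): with N replicas, writes acknowledged by W nodes, and reads served by R nodes, a read is guaranteed to intersect the latest write when R + W > N.

```python
def quorum_overlaps(n, w, r):
    """True iff any read quorum of size r must intersect any write
    quorum of size w among n replicas, i.e. r + w > n."""
    return r + w > n

# Classic configuration: 3 replicas, writes to 2, reads from 2.
print(quorum_overlaps(3, 2, 2))  # True: read and write sets must share a node
print(quorum_overlaps(3, 1, 1))  # False: a read can miss the only written replica
```

Tuning W down favors write latency, tuning R down favors read latency, and breaking the inequality trades the overlap guarantee away for availability.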
66. R E P L I C AT I O N
6 6
SYNCHRONOUS
• Nodes updated between the request and response
• Consistency over performance
ASYNCHRONOUS
• State persisted locally and replicated after response
• Performance over consistency
67. R E P L I C AT I O N
T R A D E O F F S I N D I S T R I B U T E D S Y S T E M S
6 7
[Table comparing primary-backup, gossip, 2PC, and quorum replication
across consistency, transactions, latency, throughput, data loss, and
read-only availability; the individual cell values were lost in extraction]
68. • Gossip is one of the simplest distributed communication
algorithms
• Inspired by the gossip that takes place in human communication
• Each node periodically chooses a random set of neighbors with
which to exchange information
• Information propagates through the system quickly
• Version vectors can be used to resolve conflicts in updates
R E P L I C AT I O N
6 8
G O S S I P
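A toy push-gossip simulation shows how quickly information spreads (a sketch; the fanout, cluster size, and seed are illustrative parameters, and real implementations gossip digests and use anti-entropy rather than whole rumors):

```python
import random

def gossip_rounds(num_nodes, fanout, seed=1):
    """Each round, every informed node pushes the rumor to `fanout`
    random peers. Returns the number of rounds until every node is
    informed -- typically O(log n)."""
    rng = random.Random(seed)
    informed = {0}                        # node 0 starts with the update
    rounds = 0
    while len(informed) < num_nodes:
        for node in list(informed):
            for _ in range(fanout):
                informed.add(rng.randrange(num_nodes))
        rounds += 1
    return rounds

rounds_needed = gossip_rounds(1000, fanout=3)
print(rounds_needed)  # a 1000-node cluster converges in a handful of rounds
```

The number of informed nodes grows roughly geometrically per round, which is why gossip scales to large clusters with modest per-node traffic.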
70. • Map each object to a point on the edge of a circle
• Map each machine to a pseudo-random point on the same
circle
• To find the node on which an object is stored, find the
location of the object on the edge of the circle and walk
around the circle until the first node is found
C O N S I S T E N T H A S H I N G
7 0
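The circle-walking procedure above translates almost directly into code (a minimal sketch without virtual nodes; the hash truncation and node names are illustrative choices):

```python
import bisect
import hashlib

def _point(key):
    """Map a string to a point on the ring (a 32-bit hash)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")

class ConsistentHashRing:
    """Machines sit at pseudo-random points on a circle; an object
    belongs to the first machine found walking clockwise from the
    object's own point."""
    def __init__(self, nodes):
        self._ring = sorted((_point(n), n) for n in nodes)

    def node_for(self, key):
        points = [p for p, _ in self._ring]
        i = bisect.bisect_right(points, _point(key)) % len(self._ring)
        return self._ring[i][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in ("x", "y", "z")}
bigger = ConsistentHashRing(["node-a", "node-b", "node-c", "node-d"])
moved = [k for k in before if bigger.node_for(k) != before[k]]
# Adding a node moves only the keys that fall in the new node's arc.
print(before, moved)
```

This is the property modulo hashing lacks: growing the cluster relocates only a fraction of the keys. Production systems add many virtual points per machine to even out arc sizes.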
73. • Failure detectors are characterized in terms of completeness and
accuracy
• In a synchronous system, failure detection is solvable
• Certain problems are not solvable without failure detection in an
asynchronous system
• A partitioned process is indistinguishable from a crashed process
• Thus reliable failure detection is impossible in an asynchronous system
• Failure detection is usually based on time
FA I L U R E D E T E C T I O N
7 3
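The usual time-based detector is a heartbeat with a timeout (a sketch; the node names, timestamps, and timeout value are illustrative). The timeout is exactly where the slide's trade-off lives: a short timeout suspects slow-but-alive nodes (hurting accuracy), while a long one delays detecting real crashes (hurting completeness in practice).

```python
def suspected(last_heartbeat, now, timeout):
    """A node is suspected when its last heartbeat is older than the
    timeout. A slow or partitioned node is indistinguishable from a
    crashed one -- this detector can only ever be approximate."""
    return now - last_heartbeat > timeout

heartbeats = {"node-a": 100.0, "node-b": 91.5}  # last heartbeat times (seconds)
now = 101.0
status = {n: suspected(t, now, timeout=5.0) for n, t in heartbeats.items()}
print(status)  # {'node-a': False, 'node-b': True}
```

Accrual failure detectors refine this by emitting a suspicion level instead of a binary verdict, letting each consumer pick its own threshold.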
75. L E A D E R E L E C T I O N
7 5
• The process of selecting a single node to coordinate a
cluster
• Difficult to account for failures
• Electing a leader allows a single process to control a
cluster
• Frequently used in consensus algorithms
• But a single leader can limit throughput
76. L E A D E R E L E C T I O N
7 6
B U L LY A L G O R I T H M
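The bully algorithm's behavior can be sketched in a few lines (an illustrative simplification: real implementations exchange ELECTION, ANSWER, and COORDINATOR messages with timeouts, while this sketch only models the outcome of that exchange among live nodes):

```python
def bully_elect(node_ids, crashed):
    """Toy bully election: each candidate challenges all higher IDs;
    if any live higher node answers, it takes over the election.
    The highest live ID always ends up as leader. Assumes at least
    one node is alive."""
    live = sorted(set(node_ids) - set(crashed))
    candidate = live[0]                # the node that notices the leader is gone
    while True:
        higher = [n for n in live if n > candidate]
        if not higher:
            return candidate           # nobody outranks us: declare victory
        candidate = min(higher)        # a live higher node bullies its way in

print(bully_elect({1, 2, 3, 4, 5}, crashed={5}))     # 4
print(bully_elect({1, 2, 3, 4, 5}, crashed={4, 5}))  # 3
```

The name comes from the highest-numbered node "bullying" lower nodes out of the election; its weakness is the message storm when several nodes detect the failure at once.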
78. • Single-system view, shared state
• Key to building consistent storage systems
C O N S E N S U S
7 8
79. • Agreement — every correct process must agree on the same value
• Integrity — every correct process decides at most one value
• Termination — every correct process eventually decides some value
• Validity — if all correct processes propose the same value v,
then every correct process decides v
C O N S E N S U S
7 9
80. • “Impossibility of Distributed Consensus with One Faulty
Process” — Fischer, Lynch, and Paterson
• Commonly referred to as the FLP Impossibility Result
• Consensus is impossible to guarantee in a fault-tolerant
asynchronous system
• In practice, consensus can be reached
C O N S E N S U S
8 0
81. ZooKeeper Atomic
Broadcast
“Wait-free Coordination for Internet
Scale Systems” — Hunt, Konar et al
Viewstamped Replication
“Viewstamped Replication” — Brian
M. Oki and Barbara H. Liskov
Raft
“In Search of an Understandable
Consensus Algorithm” — Diego
Ongaro and John Ousterhout
C O N S E N S U S
8 1
Paxos
“The Part-Time Parliament” — Leslie
Lamport
“Paxos Made Simple” — Leslie Lamport
82. • Leader election
• Log replication
• Failure detection
• Log compaction
• Membership changes
C O N S E N S U S
8 2