Distributed Systems

A simple explanation of basic principles of Distributed Programming with NodeJS. The CAP Theorem is fully explained, with working code that you can try yourself!


1. Distributed Systems + NodeJS
   Bruno Bossola
   MILAN 25-26 NOVEMBER 2016
   @bbossola
2. Whoami
    ● Developer since 1988
    ● XP Coach 2000+
    ● Co-founder of JUG Torino
    ● Java Champion since 2005
    ● CTO @ EF (Education First)
    I live in London, love the weather...
3. Agenda
    ● Distributed programming
    ● How does it work, what does it mean
    ● The CAP theorem
    ● CAP explained with code
      – CA system using two-phase commit
      – AP system using sloppy quorums
      – CP system using majority quorums
    ● What next?
    ● Q&A
4. Distributed programming
    ● Do we need it?
5. Distributed programming
    ● Every system has to deal with two tasks:
      – Storage
      – Computation
    ● How do we deal with scale?
    ● How do we use multiple computers to do what we used to do on one?
6. What do we want to achieve?
    ● Scalability
    ● Availability
    ● Consistency
7. Scalability
    ● The ability of a system/network/process to:
      – handle a growing amount of work
      – be enlarged to accommodate new growth
    A scalable system continues to meet the needs of its users as the scale increases.
8. Scalability flavours
    ● size:
      – more nodes, more speed
      – more nodes, more space
      – more data, same latency
    ● geographic:
      – more data centers, quicker response
    ● administrative:
      – more machines, no additional work
9. How do we scale? Partitioning
    ● Slice the dataset into smaller independent sets
    ● Reduces the impact of dataset growth
      – improves performance by limiting the amount of data to be examined
      – improves availability by allowing partitions to fail independently
10. How do we scale? Partitioning
    ● But it can also be a source of problems
      – what happens if a partition becomes unavailable?
      – what if it becomes slower?
      – what if it becomes unresponsive?
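To make the slicing concrete, here is a minimal sketch of hash-based partitioning in Node. This is illustrative only, not the talk's code: the node names and the partitionFor helper are made up. Each key is hashed and deterministically mapped to one node, so the dataset gets sliced with no coordination at all.

    // A minimal sketch of hash-based partitioning over a fixed set of nodes.
    const crypto = require('crypto');

    const nodes = ['node-0', 'node-1', 'node-2'];

    // Map a key to one partition by hashing it and taking the modulo.
    function partitionFor(key) {
      const hash = crypto.createHash('md5').update(key).digest().readUInt32BE(0);
      return nodes[hash % nodes.length];
    }

    console.log(partitionFor('user:42'));  // always the same node for the same key

Note that a plain modulo reshuffles almost every key when a node is added or removed, which is why real systems prefer consistent hashing.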
11. How do we scale? Replication
    ● Copies of the same data on multiple machines
    ● Benefits:
      – allows more servers to take part in the computation
      – improves performance by making additional computing power and bandwidth available
      – improves availability by creating copies of the data
12. How do we scale? Replication
    ● But it's also a source of problems
      – there are independent copies of the data
      – they need to be kept in sync on multiple machines
    ● Your system must follow a consistency model
    [diagram: replicas holding diverging versions v4, v5, v7, v8]
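Here is a toy illustration (again not the talk's code) of why replication forces you to pick a consistency model: the primary applies a write immediately and pushes it to the replicas asynchronously, so for a moment the copies disagree, exactly like the v4/v8 picture above.

    const primary  = new Map();
    const replicas = [new Map(), new Map()];

    function set(key, value) {
      primary.set(key, value);               // applied locally right away
      setImmediate(() => {                   // pushed to replicas asynchronously
        for (const replica of replicas) replica.set(key, value);
      });
    }

    set('x', 'v8');
    console.log(primary.get('x'), replicas[0].get('x'));  // 'v8' undefined: a stale read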
13. Availability
    ● The proportion of time a system is in functioning condition
    ● The system is fault-tolerant
      – the ability of your system to behave in a well-defined manner once a fault occurs
    ● All clients can always read and write
      – in distributed systems this is achieved by redundancy
14. Introducing: performance
    ● The amount of useful work accomplished compared to the time and resources used
    ● Basically:
      – short response time for a unit of work
      – high rate of processing
      – low utilization of resources
15. Introducing: latency
    ● The period between the initiation of something and its occurrence
    ● The time between something happening and the moment it has an impact or becomes visible
    ● More high-level examples:
      – how long until you become a zombie after a bite?
      – how long until my post is visible to others?
16. Consistency
    ● Any read on a data item X returns a value corresponding to the result of the most recent write on X
    ● Each client always has the same view of the data
    ● Also known as "strong consistency"
17. Consistency flavours
    ● Strong consistency
      – every replica sees every update in the same order
      – no two replicas may have different values at the same time
    ● Weak consistency
      – every replica will see every update, but possibly in different orders
    ● Eventual consistency
      – every replica will eventually see every update and will eventually agree on all values
18. The CAP theorem
    [diagram: the three properties CONSISTENCY, AVAILABILITY and PARTITION TOLERANCE]
19. The CAP theorem
    ● You cannot have all three :(
    ● You can only pick two properties at once
    Sorry, this has been mathematically proven, and no, it has not been debunked.
20. The CAP theorem: CA systems!
    ● You selected consistency and availability!
    ● Strict quorum protocols (two/multi-phase commit)
    ● Most RDBMS
    Hey! A network partition will f**k you up good!
21. The CAP theorem: AP systems!
    ● You selected availability and partition tolerance!
    ● Sloppy quorums and conflict resolution protocols
    ● Amazon Dynamo, Riak, Cassandra
22. The CAP theorem: CP systems!
    ● You selected consistency and partition tolerance!
    ● Majority quorum protocols (Paxos, Raft, ZAB)
    ● Apache Zookeeper, Google Spanner
23. NodeJS time!
    ● Let's write our brand new key-value store
    ● We will code all three different flavours
    ● We will have many nodes, fully replicated
    ● No sharding
    ● We will kill servers!
    ● We will trigger network partitions!
      – (no worries, it's a simulation!)
24. Node app: general design
    [diagram: an API layer exposing GET(k) and SET(k,v), a protocol core with handlers fX, fY, fZ, fK, and a storage layer backed by the database]
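A rough sketch of that layering in Node (the NodeApp class and the replicate method are hypothetical names, not the actual sysdist API): the API exposes GET/SET, and a pluggable protocol core decides how each write is coordinated across nodes.

    class NodeApp {
      constructor(protocol) {
        this.protocol = protocol;    // the pluggable core: 2PC, quorum or raft flavour
        this.storage  = new Map();   // the local database
      }
      async set(key, value) {        // SET(k,v): the protocol coordinates the write
        await this.protocol.replicate(key, value, this.storage);
      }
      get(key) {                     // GET(k): served from local storage
        return this.storage.get(key);
      }
    }

    // trivial single-node protocol, just to exercise the shape
    const local = { replicate: async (key, value, store) => store.set(key, value) };
    const app = new NodeApp(local);
    app.set('answer', 42).then(() => console.log(app.get('answer')));  // 42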
25. CA key-value store
    ● Uses classic two-phase commit
    ● Works like a local system
    ● Not partition tolerant
26. CA: two-phase commit, simplified
    [diagram: the node app with a 2PC core exchanging propose(tx), commit(tx) and rollback(tx) messages]
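A simplified, in-process sketch of the propose/commit/rollback exchange (the Participant and twoPhaseSet names are illustrative; in the real demo the participants are separate nodes talking over the network):

    class Participant {
      constructor() { this.db = new Map(); this.staged = null; }
      propose(tx) { this.staged = tx; return true; }  // phase 1: stage the write, vote yes
      commit()    { this.db.set(this.staged.key, this.staged.value); this.staged = null; }
      rollback()  { this.staged = null; }
    }

    function twoPhaseSet(participants, key, value) {
      const tx = { key, value };
      const votes = participants.map(p => p.propose(tx));  // phase 1: propose to everyone
      if (votes.every(Boolean)) {
        participants.forEach(p => p.commit());             // phase 2: all voted yes, commit
        return true;
      }
      participants.forEach(p => p.rollback());             // phase 2: someone said no, abort
      return false;
    }

    const cluster = [new Participant(), new Participant(), new Participant()];
    console.log(twoPhaseSet(cluster, 'k', 'v'), cluster[0].db.get('k'));  // true 'v'

This is also why the CA flavour is not partition tolerant: if any participant is unreachable during phase 1, the whole write blocks or aborts.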
27. AP key-value store
    ● Eventually consistent design
    ● Prioritizes availability over consistency
28. AP: sloppy quorums, simplified
    [diagram: the node app with a QUORUM core exchanging propose(tx), commit(tx) and rollback(tx) messages, plus read and repair]
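A sketch of the quorum read/write path with read repair, under the usual R + W > N rule (illustrative names and in-memory replicas; real systems make network calls and often version values with vector clocks rather than a plain counter):

    const N = 3, R = 2, W = 2;   // R + W > N, so reads and writes always overlap
    const replicas = Array.from({ length: N }, () => new Map());

    function write(key, value, version) {
      let acks = 0;
      for (const replica of replicas) {          // in reality these are network calls
        replica.set(key, { value, version });
        if (++acks >= W) return;                 // done once W replicas have acked
      }
    }

    function read(key) {
      const answers = replicas.slice(0, R)       // ask R replicas...
        .map(r => r.get(key))
        .filter(Boolean);
      const newest = answers.reduce((a, b) => (a.version >= b.version ? a : b));
      for (const replica of replicas) replica.set(key, newest);  // ...then read repair
      return newest.value;
    }

    write('k', 'v1', 1);
    console.log(read('k'));   // 'v1'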
29. CP key-value store
    ● Uses majority quorum (Raft)
    ● Guarantees consistency even during partitions (at the cost of availability)
30. CP: majority quorums (Raft, simplified)
    [diagram: the node app with a RAFT core exchanging beat, voteme and history messages]
    Urgently needs refactoring!!!!
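A tiny sketch of just the election rule behind the voteme message above (the RaftNode class is illustrative; real Raft also compares log freshness before granting a vote and randomizes election timeouts):

    class RaftNode {
      constructor(id) { this.id = id; this.term = 0; this.votedFor = null; }
      requestVote(term, candidateId) {           // the "voteme" message on the slide
        if (term > this.term) { this.term = term; this.votedFor = null; }
        if (term === this.term && this.votedFor === null) {
          this.votedFor = candidateId;           // one vote per term, first come first served
          return true;
        }
        return false;
      }
    }

    const peers = [new RaftNode('a'), new RaftNode('b'), new RaftNode('c')];
    const candidate = peers[0];
    const term = candidate.term + 1;
    const votes = peers.filter(p => p.requestVote(term, candidate.id)).length;
    console.log(votes > peers.length / 2 ? candidate.id + ' is leader' : 'no leader');

Because a leader needs a strict majority, at most one leader can exist per term, which is what keeps the store consistent when the network splits.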
31. What about BASE?
    ● It's just a way to qualify eventually consistent systems
    ● BAsic Availability
      – The database appears to work most of the time.
    ● Soft-state
      – Stores don't have to be write-consistent, nor do different replicas have to be mutually consistent all the time.
    ● Eventual consistency
      – Stores exhibit consistency at some later point (e.g., lazily at read time).
32. What about Lamport clocks?
    ● It's a mechanism to maintain a distributed notion of time
    ● Each process maintains a counter:
      – whenever a process does work, increment the counter
      – whenever a process sends a message, include the counter
      – when a message is received, set the counter to max(local_counter, received_counter) + 1
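The three rules translate almost line by line into code; this LamportClock class is an illustrative sketch:

    class LamportClock {
      constructor() { this.counter = 0; }
      tick()            { this.counter += 1; }    // whenever the process does work
      send()            { return this.counter; }  // include the counter in the message
      receive(received) {                         // max(local, received) + 1
        this.counter = Math.max(this.counter, received) + 1;
      }
    }

    const a = new LamportClock(), b = new LamportClock();
    a.tick();                 // a did some work: a.counter === 1
    b.receive(a.send());      // b.counter === max(0, 1) + 1 === 2
    console.log(a.counter, b.counter);   // 1 2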
33. What about vector clocks?
    ● Maintain an array of N Lamport clocks, one per node
    ● Whenever a process does work, increment the logical clock value of its own node in the vector
    ● Whenever a process sends a message, include the full vector
    ● When a message is received:
      – update each element to max(local, received)
      – then increment the logical clock of the current node in the vector
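And the same sketch extended to a vector of clocks, one entry per node (this VectorClock class is illustrative; like most implementations it counts the send itself as work):

    class VectorClock {
      constructor(nodeId, nodeIds) {
        this.nodeId = nodeId;
        this.clock = Object.fromEntries(nodeIds.map(id => [id, 0]));
      }
      tick() { this.clock[this.nodeId] += 1; }           // bump this node's own entry
      send() { this.tick(); return { ...this.clock }; }  // include the full vector
      receive(received) {
        for (const id of Object.keys(this.clock))        // element-wise max(local, received)
          this.clock[id] = Math.max(this.clock[id], received[id] || 0);
        this.tick();                                     // then bump this node's own entry
      }
    }

    const a = new VectorClock('a', ['a', 'b']);
    const b = new VectorClock('b', ['a', 'b']);
    b.receive(a.send());
    console.log(b.clock);   // { a: 1, b: 1 }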
34. What next?
    ● Learn the lingo and the basics
    ● Do your homework
    ● Start playing with these concepts
    ● It's complicated, but not rocket science
    ● Be inspired!
35. Q&A
    Amazon Dynamo:
    http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
    The Raft consensus algorithm:
    https://raft.github.io/
    http://thesecretlivesofdata.com/raft/
    The code used in this presentation:
    https://github.com/bbossola/sysdist
