SlideShare a Scribd company logo
1 of 35
Download to read offline
Distributed Systems + NodeJS
Bruno Bossola
MILAN 25-26 NOVEMBER 2016
@bbossola
@bbossola
Whoami
● Developer since 1988
● XP Coach 2000+
● Co-founder of JUG Torino
● Java Champion since 2005
● CTO @ EF (Education First)
I live in London, love the weather...
@bbossola
Agenda
● Distributed programming
● How does it work, what does it mean
● The CAP theorem
● CAP explained with code
– CA system using two phase commit
– AP system using sloppy quorums
– CP system using majority quorums
● What next?
● Q&A
@bbossola
Distributed programming
● Do we need it?
@bbossola
Distributed programming
● Any system should deal with two tasks:
– Storage
– Computation
● How do we deal with scale?
● How do we use multiple computers to do what we used to
do on one?
@bbossola
What do we want to achieve?
● Scalability
● Availability
● Consistency
@bbossola
Scalability
● The ability of a system/network/process to:
– handle a growing amount of work
– be enlarged to accommodate new growth
A scalable system continue to meet the needs of its users as the
scale increase
clipart courtesy of openclipart.org
clipart courtesy of openclipart.org
@bbossola
Scalability flavours
● size:
– more nodes, more speed
– more nodes, more space
– more data, same latency
● geographic:
– more data centers, quicker response
● administrative:
– more machines, no additional work
@bbossola
How do we scale? partitioning
● Slice the dataset into smaller independent sets
● reduces the impact of dataset growth
– improves performance by limiting the amount of data to
be examined
– improves availability by the ability of partitions to fail
indipendently
@bbossola
How do we scale? partitioning
● But can also be a source of problems
– what happens if a partition become unavailable?
– what if It becomes slower?
– what if it becomes unresponsive?
clipart courtesy of openclipart.org
@bbossola
How do we scale? replication
● Copies of the same data on multiple machines
● Benefits:
– allows more servers to take part in the computation
– improves performance by making additional computing
power and bandwidth
– improves availability by creating copy of the data
@bbossola
How do we scale? replication
● But it's also a source of problems
– there are independent copies of the data
– need to be kept in sync on multiple machines
● Your system must follow a consistency model
v4 v4
v8
v8 v4 v5
v7
v8
clipart courtesy of openclipart.org
@bbossola
Availability
● The proportion of time a system is in functioning conditions
● The system is fault-tolerant
– the ability of your system to behave in a well defined
manner once a fault occurs
● All clients can always read and write
– In distributed systems this
is achieved by redundancy
clipart courtesy of openclipart.org
@bbossola
Introducing: performance
● The amount of useful work accomplished compared to the
time and resources used
● Basically:
– short response time for a unit of work
– high rate of processing
– low utilization of resources
clipart courtesy of openclipart.org
@bbossola
Introducing: latency
● The period between the initiation of something and the
occurrence
● The time between something happened and the time it has
an impact or become visible
● more high level examples:
– how long until you become a zombie
after a bite?
– how long until my post is visible
to others?
clipart courtesy of cliparts.co
@bbossola
Consistency
● Any read on a data item X returns a value corresponding
to the result of the most recent write on X.
● Each client always has the same view of the data
● Also know as “Strong Consistency”
clipart courtesy of cliparts.co
@bbossola
Consistency flavours
● Strong consistency
– every replica sees every update in the same order.
– no two replicas may have different values at the same
time.
● Weak consistency
– every replica will see every update, but possibly in
different orders.
● Eventual consistency
– every replica will eventually see every update and will
eventually agree on all values.
@bbossola
The CAP theorem
CONSISTENCY AVAILABILITY
PARTITION
TOLERANCE
@bbossola
The CAP theorem
● You cannot have all :(
● You can select two
properties at once
Sorry, this has been mathematically proven and no, has not been debunked.
@bbossola
The CAP theorem
CA systems!
● You selected consistency
and availability!
● Strict quorum protocols
(two/multi phase commit)
● Most RDBMS
Hey! A network partition will
f**k you up good!
@bbossola
The CAP theorem
AP systems!
● You selected availability
and partition tolerance!
● Sloppy quorums and
conflict resolution protocols
● Amazon Dynamo, Riak,
Cassandra
@bbossola
The CAP theorem
CP systems!
● You selected consistency
and partition tolerance!
● Majority quorum protocols
(paxos, raft, zab)
● Apache Zookeeper,
Google Spanner
@bbossola
NodeJS time!
● Let's write our brand new key value store
● We will code all three different flavours
● We will have many nodes, fully replicated
● No sharding
● We will kill servers!
● We will trigger network
partitions!
– (no worries. it's a simulation!)
clipart courtesy of cliparts.co
@bbossola
Node APP
General design
<proto>
APIStorage
API
GET (k) SET (k,v)
<proto>
Storage
Database
<proto>
Core
fX fY fZ fK
@bbossola
CA key-value store
● Uses classic two-phase commit
● Works like a local system
● Not partition tolerant
@bbossola
Nodeapp
CA: two phase commit, simplified
2PC
API
Storage
API
GET (k) SET (k,v)
Storage
Database
2PC
Core
propose
(tx)
commit
(tx)
rollback
(tx)
@bbossola
AP key-value store
● Eventually consistent design
● Prioritizes availability over consistency
@bbossola
Nodeapp`
AP: sloppy quorums, simplified
QUORUM
API
Storage
API
GET (k) SET (k,v)
Storage
Database
QUORUM
Core
(read) (repair)
propose
(tx)
commit
(tx)
rollback
(tx)
@bbossola
CP key-value store
● Uses majority quorum (raft)
● Guarantees eventual consistency
@bbossola
CP: majority quorums (raft, simplified)
RAFT
API
Storage
API
GET (k) SET (k,v)
Storage
Database
RAFT
Core
beat
voteme history
Nodeapp`
Urgently needs
refactoring!!!!
@bbossola
What about BASE?
● It's just a way to qualify eventually consistent systems
● BAsic Availability
– The database appears to work most of the time.
● Soft-state
– Stores don’t have to be write-consistent, nor do different
replicas have to be mutually consistent all the time.
● Eventual consistency
– Stores exhibit consistency at some later point (e.g.,
lazily at read time).
@bbossola
What about Lamport clocks?
● It's a mechanism to maintain a distributed notion of time
● Each process maintains a counter
– Whenever a process does work, increment the counter
– Whenever a process sends a message, include the
counter
– When a message is received, set the counter to
max(local_counter, received_counter) + 1
clipart courtesy of cliparts.co
@bbossola
What about Vector clocks?
● Maintains an array of N Lamport clocks, one per each
node
● Whenever a process does work, increment the logical
clock value of the node in the vector
● Whenever a process sends a message, include the full
vector
● When a message is received:
– update each element in
● max(local, received)
– increment the logical clock
– of the current node in the vector
clipart courtesy of cliparts.co
@bbossola
What next?
● Learn the lingo and the basics
● Do your homework
● Start playing with these concepts
● It's complicated, but not rocket science
● Be inspired!
@bbossola
Q&A
Amazon Dynamo:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
The RAFT consensus algorithm:
https://raft.github.io/
http://thesecretlivesofdata.com/raft/
The code used into this presentation:
https://github.com/bbossola/sysdist
clipart courtesy of cliparts.co

More Related Content

What's hot

Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...
Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...
Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...
confluent
 

What's hot (20)

Is It Fast? : Measuring MongoDB Performance
Is It Fast? : Measuring MongoDB PerformanceIs It Fast? : Measuring MongoDB Performance
Is It Fast? : Measuring MongoDB Performance
 
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
Streaming Replication (Keynote @ PostgreSQL Conference 2009 Japan)
 
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean FellowsDeploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
#NetflixEverywhere Global Architecture
#NetflixEverywhere Global Architecture#NetflixEverywhere Global Architecture
#NetflixEverywhere Global Architecture
 
EVCache at Netflix
EVCache at NetflixEVCache at Netflix
EVCache at Netflix
 
ES & Kafka
ES & KafkaES & Kafka
ES & Kafka
 
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
 
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
 
Multiply like rabbits with rabbit mq
Multiply like rabbits with rabbit mqMultiply like rabbits with rabbit mq
Multiply like rabbits with rabbit mq
 
Migrating applications to serverless Apache Kafka + KSQL
Migrating applications to serverless Apache Kafka + KSQLMigrating applications to serverless Apache Kafka + KSQL
Migrating applications to serverless Apache Kafka + KSQL
 
Deep dive into Apache Kafka consumption
Deep dive into Apache Kafka consumptionDeep dive into Apache Kafka consumption
Deep dive into Apache Kafka consumption
 
On Rabbits and Elephants
On Rabbits and ElephantsOn Rabbits and Elephants
On Rabbits and Elephants
 
A SOA approximation on symfony
A SOA approximation on symfonyA SOA approximation on symfony
A SOA approximation on symfony
 
Application Caching: The Hidden Microservice
Application Caching: The Hidden MicroserviceApplication Caching: The Hidden Microservice
Application Caching: The Hidden Microservice
 
Rabbitmq & Postgresql
Rabbitmq & PostgresqlRabbitmq & Postgresql
Rabbitmq & Postgresql
 
Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...
Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...
Discover Kafka on OpenShift: Processing Real-Time Financial Events at Scale (...
 
Connecting kafka message systems with scylla
Connecting kafka message systems with scylla   Connecting kafka message systems with scylla
Connecting kafka message systems with scylla
 
How to build an event driven architecture with kafka and kafka connect
How to build an event driven architecture with kafka and kafka connectHow to build an event driven architecture with kafka and kafka connect
How to build an event driven architecture with kafka and kafka connect
 
Messaging Standards and Systems - AMQP & RabbitMQ
Messaging Standards and Systems - AMQP & RabbitMQMessaging Standards and Systems - AMQP & RabbitMQ
Messaging Standards and Systems - AMQP & RabbitMQ
 

Viewers also liked

Viewers also liked (20)

Milano Chatbots Meetup - Vittorio Banfi - Bot Design - Codemotion Milan 2016
Milano Chatbots Meetup - Vittorio Banfi - Bot Design - Codemotion Milan 2016 Milano Chatbots Meetup - Vittorio Banfi - Bot Design - Codemotion Milan 2016
Milano Chatbots Meetup - Vittorio Banfi - Bot Design - Codemotion Milan 2016
 
DevOps in Cloud, dai Container all'approccio Codeless - Gabriele Provinciali,...
DevOps in Cloud, dai Container all'approccio Codeless - Gabriele Provinciali,...DevOps in Cloud, dai Container all'approccio Codeless - Gabriele Provinciali,...
DevOps in Cloud, dai Container all'approccio Codeless - Gabriele Provinciali,...
 
Elixir and Lambda talk with a Telegram bot - Paolo Montrasio - Codemotion Mil...
Elixir and Lambda talk with a Telegram bot - Paolo Montrasio - Codemotion Mil...Elixir and Lambda talk with a Telegram bot - Paolo Montrasio - Codemotion Mil...
Elixir and Lambda talk with a Telegram bot - Paolo Montrasio - Codemotion Mil...
 
Games of Simplicity - Pozzi; Molinari - Codemotion Milan 2016
Games of Simplicity - Pozzi; Molinari - Codemotion Milan 2016Games of Simplicity - Pozzi; Molinari - Codemotion Milan 2016
Games of Simplicity - Pozzi; Molinari - Codemotion Milan 2016
 
Come rendere il proprio prodotto una bomba creandogli una intera community in...
Come rendere il proprio prodotto una bomba creandogli una intera community in...Come rendere il proprio prodotto una bomba creandogli una intera community in...
Come rendere il proprio prodotto una bomba creandogli una intera community in...
 
Understanding Angular 2 - Shmuela Jacobs - Codemotion Milan 2016
Understanding Angular 2 - Shmuela Jacobs - Codemotion Milan 2016Understanding Angular 2 - Shmuela Jacobs - Codemotion Milan 2016
Understanding Angular 2 - Shmuela Jacobs - Codemotion Milan 2016
 
How To Structure Go Applications - Paul Bellamy - Codemotion Milan 2016
How To Structure Go Applications - Paul Bellamy - Codemotion Milan 2016How To Structure Go Applications - Paul Bellamy - Codemotion Milan 2016
How To Structure Go Applications - Paul Bellamy - Codemotion Milan 2016
 
Keynote: Community Innovation Alaina Percival - Codemotion Milan 2016
Keynote: Community Innovation Alaina Percival - Codemotion Milan 2016Keynote: Community Innovation Alaina Percival - Codemotion Milan 2016
Keynote: Community Innovation Alaina Percival - Codemotion Milan 2016
 
Keynote: The Most Important Thing - Mike Lee - Codemotion Milan 2016
Keynote: The Most Important Thing - Mike Lee - Codemotion Milan 2016Keynote: The Most Important Thing - Mike Lee - Codemotion Milan 2016
Keynote: The Most Important Thing - Mike Lee - Codemotion Milan 2016
 
Universal JavaScript Web Applications with React - Luciano Mammino - Codemoti...
Universal JavaScript Web Applications with React - Luciano Mammino - Codemoti...Universal JavaScript Web Applications with React - Luciano Mammino - Codemoti...
Universal JavaScript Web Applications with React - Luciano Mammino - Codemoti...
 
Big Data, Small Dashboard - Andrea Maietta - Codemotion Milan 2016
Big Data, Small Dashboard - Andrea Maietta - Codemotion Milan 2016Big Data, Small Dashboard - Andrea Maietta - Codemotion Milan 2016
Big Data, Small Dashboard - Andrea Maietta - Codemotion Milan 2016
 
Pronti per la legge sulla data protection GDPR? No Panic! - Stefano Sali, Dom...
Pronti per la legge sulla data protection GDPR? No Panic! - Stefano Sali, Dom...Pronti per la legge sulla data protection GDPR? No Panic! - Stefano Sali, Dom...
Pronti per la legge sulla data protection GDPR? No Panic! - Stefano Sali, Dom...
 
Public speaking 4 geeks - Lorenzo Barbieri - Codemotion Milan 2016
Public speaking 4 geeks - Lorenzo Barbieri - Codemotion Milan 2016Public speaking 4 geeks - Lorenzo Barbieri - Codemotion Milan 2016
Public speaking 4 geeks - Lorenzo Barbieri - Codemotion Milan 2016
 
Microservices done right or SOA lessons learnt - Sean Farmar - Codemotion Mil...
Microservices done right or SOA lessons learnt - Sean Farmar - Codemotion Mil...Microservices done right or SOA lessons learnt - Sean Farmar - Codemotion Mil...
Microservices done right or SOA lessons learnt - Sean Farmar - Codemotion Mil...
 
Cyber Analysts: who they are, what they do, where they are - Marco Ramilli - ...
Cyber Analysts: who they are, what they do, where they are - Marco Ramilli - ...Cyber Analysts: who they are, what they do, where they are - Marco Ramilli - ...
Cyber Analysts: who they are, what they do, where they are - Marco Ramilli - ...
 
Lo sviluppo di Edge Guardian VR - Maurizio Tatafiore - Codemotion Milan 2016
Lo sviluppo di Edge Guardian VR - Maurizio Tatafiore - Codemotion Milan 2016Lo sviluppo di Edge Guardian VR - Maurizio Tatafiore - Codemotion Milan 2016
Lo sviluppo di Edge Guardian VR - Maurizio Tatafiore - Codemotion Milan 2016
 
Master the chaos: from raw data to analytics - Andrea Pompili, Riccardo Rossi...
Master the chaos: from raw data to analytics - Andrea Pompili, Riccardo Rossi...Master the chaos: from raw data to analytics - Andrea Pompili, Riccardo Rossi...
Master the chaos: from raw data to analytics - Andrea Pompili, Riccardo Rossi...
 
Time to market: when a worse game is better - Mattia Traverso - Codemotion Mi...
Time to market: when a worse game is better - Mattia Traverso - Codemotion Mi...Time to market: when a worse game is better - Mattia Traverso - Codemotion Mi...
Time to market: when a worse game is better - Mattia Traverso - Codemotion Mi...
 
From a developer to a teamleader — an unexpected journey - Vitaly Sharovatov ...
From a developer to a teamleader — an unexpected journey - Vitaly Sharovatov ...From a developer to a teamleader — an unexpected journey - Vitaly Sharovatov ...
From a developer to a teamleader — an unexpected journey - Vitaly Sharovatov ...
 
Situational Awareness, Botnet and Malware Detection in the Modern Era - Davi...
Situational Awareness, Botnet and Malware Detection in the Modern Era  - Davi...Situational Awareness, Botnet and Malware Detection in the Modern Era  - Davi...
Situational Awareness, Botnet and Malware Detection in the Modern Era - Davi...
 

Similar to Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016

Similar to Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016 (20)

Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG TorinoDistributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
Distributed Systems explained (with NodeJS) - Bruno Bossola, JUG Torino
 
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling StoryPHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
 
Event driven architectures with Kinesis
Event driven architectures with KinesisEvent driven architectures with Kinesis
Event driven architectures with Kinesis
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3
 
Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph Ceph Community Talk on High-Performance Solid Sate Ceph
Ceph Community Talk on High-Performance Solid Sate Ceph
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big DataVoxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
Voxxed Athens 2018 - Methods and Practices for Guaranteed Failure in Big Data
 
Concurrency, Parallelism And IO
Concurrency,  Parallelism And IOConcurrency,  Parallelism And IO
Concurrency, Parallelism And IO
 
NoSQL Evolution
NoSQL EvolutionNoSQL Evolution
NoSQL Evolution
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)
 
lessons from managing a pulsar cluster
 lessons from managing a pulsar cluster lessons from managing a pulsar cluster
lessons from managing a pulsar cluster
 
Couchbase live 2016
Couchbase live 2016Couchbase live 2016
Couchbase live 2016
 
Ceph at Work in Bloomberg: Object Store, RBD and OpenStack
Ceph at Work in Bloomberg: Object Store, RBD and OpenStackCeph at Work in Bloomberg: Object Store, RBD and OpenStack
Ceph at Work in Bloomberg: Object Store, RBD and OpenStack
 
Software Design for Persistent Memory Systems
Software Design for Persistent Memory SystemsSoftware Design for Persistent Memory Systems
Software Design for Persistent Memory Systems
 
071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen071410 sun a_1515_feldman_stephen
071410 sun a_1515_feldman_stephen
 
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob KaralusDistributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
 
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
PyCon HK 2018 - Heterogeneous job processing with Apache Kafka
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry...
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 

More from Codemotion

More from Codemotion (20)

Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
Fuzz-testing: A hacker's approach to making your code more secure | Pascal Ze...
 
Pompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending storyPompili - From hero to_zero: The FatalNoise neverending story
Pompili - From hero to_zero: The FatalNoise neverending story
 
Pastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storiaPastore - Commodore 65 - La storia
Pastore - Commodore 65 - La storia
 
Pennisi - Essere Richard Altwasser
Pennisi - Essere Richard AltwasserPennisi - Essere Richard Altwasser
Pennisi - Essere Richard Altwasser
 
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
Michel Schudel - Let's build a blockchain... in 40 minutes! - Codemotion Amst...
 
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
Richard Süselbeck - Building your own ride share app - Codemotion Amsterdam 2019
 
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
Eward Driehuis - What we learned from 20.000 attacks - Codemotion Amsterdam 2019
 
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 - Francesco Baldassarri  - Deliver Data at Scale - Codemotion Amsterdam 2019 -
Francesco Baldassarri - Deliver Data at Scale - Codemotion Amsterdam 2019 -
 
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
Martin Förtsch, Thomas Endres - Stereoscopic Style Transfer AI - Codemotion A...
 
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
Melanie Rieback, Klaus Kursawe - Blockchain Security: Melting the "Silver Bul...
 
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
Angelo van der Sijpt - How well do you know your network stack? - Codemotion ...
 
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
Lars Wolff - Performance Testing for DevOps in the Cloud - Codemotion Amsterd...
 
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
Sascha Wolter - Conversational AI Demystified - Codemotion Amsterdam 2019
 
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
Michele Tonutti - Scaling is caring - Codemotion Amsterdam 2019
 
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
Pat Hermens - From 100 to 1,000+ deployments a day - Codemotion Amsterdam 2019
 
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
James Birnie - Using Many Worlds of Compute Power with Quantum - Codemotion A...
 
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
Don Goodman-Wilson - Chinese food, motor scooters, and open source developmen...
 
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
Pieter Omvlee - The story behind Sketch - Codemotion Amsterdam 2019
 
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
Dave Farley - Taking Back “Software Engineering” - Codemotion Amsterdam 2019
 
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
Joshua Hoffman - Should the CTO be Coding? - Codemotion Amsterdam 2019
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Distributed System explained (with NodeJS) - Bruno Bossola - Codemotion Milan 2016

  • 1. Distributed Systems + NodeJS Bruno Bossola MILAN 25-26 NOVEMBER 2016 @bbossola
  • 2. @bbossola Whoami ● Developer since 1988 ● XP Coach 2000+ ● Co-founder of JUG Torino ● Java Champion since 2005 ● CTO @ EF (Education First) I live in London, love the weather...
  • 3. @bbossola Agenda ● Distributed programming ● How does it work, what does it mean ● The CAP theorem ● CAP explained with code – CA system using two phase commit – AP system using sloppy quorums – CP system using majority quorums ● What next? ● Q&A
  • 5. @bbossola Distributed programming ● Any system should deal with two tasks: – Storage – Computation ● How do we deal with scale? ● How do we use multiple computers to do what we used to do on one?
  • 6. @bbossola What do we want to achieve? ● Scalability ● Availability ● Consistency
  • 7. @bbossola Scalability ● The ability of a system/network/process to: – handle a growing amount of work – be enlarged to accommodate new growth A scalable system continue to meet the needs of its users as the scale increase clipart courtesy of openclipart.org clipart courtesy of openclipart.org
  • 8. @bbossola Scalability flavours ● size: – more nodes, more speed – more nodes, more space – more data, same latency ● geographic: – more data centers, quicker response ● administrative: – more machines, no additional work
  • 9. @bbossola How do we scale? partitioning ● Slice the dataset into smaller independent sets ● reduces the impact of dataset growth – improves performance by limiting the amount of data to be examined – improves availability by the ability of partitions to fail indipendently
  • 10. @bbossola How do we scale? partitioning ● But can also be a source of problems – what happens if a partition become unavailable? – what if It becomes slower? – what if it becomes unresponsive? clipart courtesy of openclipart.org
  • 11. @bbossola How do we scale? replication ● Copies of the same data on multiple machines ● Benefits: – allows more servers to take part in the computation – improves performance by making additional computing power and bandwidth – improves availability by creating copy of the data
  • 12. @bbossola How do we scale? replication ● But it's also a source of problems – there are independent copies of the data – need to be kept in sync on multiple machines ● Your system must follow a consistency model v4 v4 v8 v8 v4 v5 v7 v8 clipart courtesy of openclipart.org
  • 13. @bbossola Availability ● The proportion of time a system is in functioning conditions ● The system is fault-tolerant – the ability of your system to behave in a well defined manner once a fault occurs ● All clients can always read and write – In distributed systems this is achieved by redundancy clipart courtesy of openclipart.org
  • 14. @bbossola Introducing: performance ● The amount of useful work accomplished compared to the time and resources used ● Basically: – short response time for a unit of work – high rate of processing – low utilization of resources clipart courtesy of openclipart.org
  • 15. @bbossola Introducing: latency ● The period between the initiation of something and the occurrence ● The time between something happened and the time it has an impact or become visible ● more high level examples: – how long until you become a zombie after a bite? – how long until my post is visible to others? clipart courtesy of cliparts.co
  • 16. @bbossola Consistency ● Any read on a data item X returns a value corresponding to the result of the most recent write on X. ● Each client always has the same view of the data ● Also know as “Strong Consistency” clipart courtesy of cliparts.co
  • 17. @bbossola Consistency flavours ● Strong consistency – every replica sees every update in the same order. – no two replicas may have different values at the same time. ● Weak consistency – every replica will see every update, but possibly in different orders. ● Eventual consistency – every replica will eventually see every update and will eventually agree on all values.
  • 18. @bbossola The CAP theorem CONSISTENCY AVAILABILITY PARTITION TOLERANCE
  • 19. @bbossola The CAP theorem ● You cannot have all :( ● You can select two properties at once Sorry, this has been mathematically proven and no, has not been debunked.
  • 20. @bbossola The CAP theorem CA systems! ● You selected consistency and availability! ● Strict quorum protocols (two/multi phase commit) ● Most RDBMS Hey! A network partition will f**k you up good!
  • 21. @bbossola The CAP theorem AP systems! ● You selected availability and partition tolerance! ● Sloppy quorums and conflict resolution protocols ● Amazon Dynamo, Riak, Cassandra
  • 22. @bbossola The CAP theorem CP systems! ● You selected consistency and partition tolerance! ● Majority quorum protocols (paxos, raft, zab) ● Apache Zookeeper, Google Spanner
  • 23. @bbossola NodeJS time! ● Let's write our brand new key value store ● We will code all three different flavours ● We will have many nodes, fully replicated ● No sharding ● We will kill servers! ● We will trigger network partitions! – (no worries. it's a simulation!) clipart courtesy of cliparts.co
  • 24. @bbossola Node APP General design <proto> APIStorage API GET (k) SET (k,v) <proto> Storage Database <proto> Core fX fY fZ fK
  • 25. @bbossola CA key-value store ● Uses classic two-phase commit ● Works like a local system ● Not partition tolerant
  • 26. @bbossola Nodeapp CA: two phase commit, simplified 2PC API Storage API GET (k) SET (k,v) Storage Database 2PC Core propose (tx) commit (tx) rollback (tx)
  • 27. @bbossola AP key-value store ● Eventually consistent design ● Prioritizes availability over consistency
  • 28. @bbossola Nodeapp` AP: sloppy quorums, simplified QUORUM API Storage API GET (k) SET (k,v) Storage Database QUORUM Core (read) (repair) propose (tx) commit (tx) rollback (tx)
  • 29. @bbossola CP key-value store ● Uses majority quorum (raft) ● Guarantees eventual consistency
  • 30. @bbossola CP: majority quorums (raft, simplified) RAFT API Storage API GET (k) SET (k,v) Storage Database RAFT Core beat voteme history Nodeapp` Urgently needs refactoring!!!!
  • 31. @bbossola What about BASE? ● It's just a way to qualify eventually consistent systems ● BAsic Availability – The database appears to work most of the time. ● Soft-state – Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time. ● Eventual consistency – Stores exhibit consistency at some later point (e.g., lazily at read time).
  • 32. @bbossola What about Lamport clocks? ● It's a mechanism to maintain a distributed notion of time ● Each process maintains a counter – Whenever a process does work, increment the counter – Whenever a process sends a message, include the counter – When a message is received, set the counter to max(local_counter, received_counter) + 1 clipart courtesy of cliparts.co
  • 33. @bbossola What about Vector clocks? ● Maintains an array of N Lamport clocks, one per each node ● Whenever a process does work, increment the logical clock value of the node in the vector ● Whenever a process sends a message, include the full vector ● When a message is received: – update each element in ● max(local, received) – increment the logical clock – of the current node in the vector clipart courtesy of cliparts.co
  • 34. @bbossola What next? ● Learn the lingo and the basics ● Do your homework ● Start playing with these concepts ● It's complicated, but not rocket science ● Be inspired!
  • 35. @bbossola Q&A Amazon Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html The RAFT consensus algorithm: https://raft.github.io/ http://thesecretlivesofdata.com/raft/ The code used into this presentation: https://github.com/bbossola/sysdist clipart courtesy of cliparts.co