DISTRIBUTED 
COMPUTING 
FOR NEW BLOODS 
RAYMOND TAY 
I WORK IN HEWLETT-PACKARD LABS SINGAPORE 
@RAYMONDTAYBL
What is a 
Distributed 
System? 
It’s been 
there for 
a long 
time 
You did not realize it
DISTRIBUTED 
SYSTEMS ARE 
THE NEW 
FRONTIER 
whether YOU like it or not 
LAN Games 
Mobile 
Databases 
ATMs 
Social Media
WHAT IS THE 
ESSENCE OF 
DISTRIBUTED 
SYSTEMS? 
IT IS ABOUT THE 
ATTEMPT TO 
OVERCOME 
INFORMATION 
TRAVEL 
WHEN 
INDEPENDENT 
PROCESSES FAIL
Information flows
at the speed of light!
MESSAGE 
X Y
When 
independent 
things DON’T fail.
I’ve sent it. I’ve received it. 
X Y 
I’ve received it. I’ve sent it.
When independent 
things fail, 
independently. 
X Y 
NETWORK FAILURE
When independent 
things fail, 
independently. 
X Y 
NODE FAILURE
There is no difference 
between a slow node 
and a dead node
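A minimal sketch of why the two cases look the same from the outside, assuming a heartbeat-based failure detector. The class `HeartbeatDetector`, the parameter `timeout_s` and the node names are hypothetical and used only for illustration. The detector can only say "suspected", never "dead".

```python
class HeartbeatDetector:
    """Suspects a node when no heartbeat arrived within timeout_s seconds."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        # Record the time at which the latest heartbeat from `node` arrived.
        self.last_seen[node] = now

    def suspected(self, node: str, now: float) -> bool:
        # A node never heard from, or silent for longer than the timeout,
        # is suspected. The detector cannot tell *why* it is silent.
        last = self.last_seen.get(node)
        return last is None or (now - last) > self.timeout_s


detector = HeartbeatDetector(timeout_s=2.0)
detector.heartbeat("slow-node", now=0.0)  # alive, but its next heartbeat is delayed
detector.heartbeat("dead-node", now=0.0)  # crashes right after this heartbeat

# At t=3.0 both nodes look identical to the observer.
slow_suspected = detector.suspected("slow-node", now=3.0)
dead_suspected = detector.suspected("dead-node", now=3.0)

# The slow node's delayed heartbeat finally arrives at t=3.5.
detector.heartbeat("slow-node", now=3.5)
slow_after = detector.suspected("slow-node", now=4.0)
dead_after = detector.suspected("dead-node", now=4.0)
```

Only the late heartbeat separates the two nodes, and the observer has no way of knowing in advance whether it will ever arrive.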
Why do we NEED it? 
Scalability 
when the needs of the
system outgrow what a
single node can provide
Availability 
Enabling resilience 
when a node fails
IS IT REALLY THAT 
HARD? 
The answer lies in 
knowing what is 
possible and what is not 
possible…
Even widely accepted 
facts are challenged…
THE FALLACIES OF
DISTRIBUTED COMPUTING
(Peter Deutsch and other fellows at Sun Microsystems)
• The network is reliable
• The network is secure (see: 2013 Cost of Cyber Crime Study: United States)
• The network is homogeneous
• Latency is zero (see: Ingo Rammer on latency vs bandwidth)
• Bandwidth is infinite
• Topology doesn’t change
• There is one administrator
• Transport cost is zero
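One common way to stop assuming a reliable network is to expect calls to fail and retry them with a growing delay. The sketch below is an illustration only: `call_with_retries`, `flaky_call` and their parameters are hypothetical names, and the "remote call" is simulated in-process.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay_s * (2 ** attempt))
    raise ValueError("attempts must be at least 1")


# A stand-in for a remote call that fails twice before succeeding.
calls = {"count": 0}


def flaky_call() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


# Record the requested delays instead of really sleeping.
delays: list[float] = []
result = call_with_retries(flaky_call, sleep=delays.append)
```

Retrying is only safe when the remote operation is idempotent, because the caller cannot tell whether a failed attempt was applied on the other side.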
WHAT WAS TRIED IN THE 
PAST? 
DISTRIBUTED OBJECTS 
REMOTE PROCEDURE CALL 
DISTRIBUTED SHARED MUTABLE STATE
WHAT WAS ATTEMPTED?
DISTRIBUTED OBJECTS
• Typically, programming constructs that “abstract” away the
difference between local and distributed objects
• Ignores latencies
• Handles failures with this attitude → “I dunno what to
do… YOU (i.e. the invoker) handle it”.
WHAT WAS ATTEMPTED?
REMOTE PROCEDURE CALL
• Assumes the synchronous processing model
• Asynchronous RPC tries to model itself after
synchronous RPC…
WHAT WAS ATTEMPTED?
DISTRIBUTED SHARED MUTABLE STATE
• Found in Distributed Shared Memory systems, i.e.
“1” address space partitioned into “x” address spaces for “x” nodes
• e.g. JavaSpaces ⇒ Danger of 2 independent
processes both committing successfully.
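The danger of two independent writers both "succeeding" can be sketched with an in-memory stand-in for shared state. This is not the JavaSpaces API: `VersionedCell`, `blind_write` and `compare_and_set` are hypothetical names chosen for illustration.

```python
class VersionedCell:
    """A shared value with a version counter, standing in for shared state."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.version = 0

    def read(self) -> tuple[int, int]:
        return self.value, self.version

    def blind_write(self, value: int) -> bool:
        # Always "commits", even if someone else wrote in the meantime.
        self.value = value
        self.version += 1
        return True

    def compare_and_set(self, value: int, expected_version: int) -> bool:
        # Commits only if nobody else has written since `expected_version`.
        if self.version != expected_version:
            return False
        self.value = value
        self.version += 1
        return True


# Two independent processes read the same value, then both write.
cell = VersionedCell(100)
seen_by_p1, _ = cell.read()
seen_by_p2, _ = cell.read()
p1_ok = cell.blind_write(seen_by_p1 + 10)
p2_ok = cell.blind_write(seen_by_p2 + 5)
blind_result = cell.value  # P1's +10 has been silently overwritten

# Same interleaving, but each write names the version it was based on.
safe = VersionedCell(100)
v1_value, v1 = safe.read()
v2_value, v2 = safe.read()
p1_safe_ok = safe.compare_and_set(v1_value + 10, v1)
p2_safe_ok = safe.compare_and_set(v2_value + 5, v2)
safe_result = safe.value
```

With blind writes both processes are told they committed and one update is lost. With the versioned write the second process is told to retry.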
YOU CAN APPRECIATE 
THAT IT IS 
VERY HARD TO SOLVE 
IN ITS ENTIRETY
The question really 
is “How can we do 
better?” 
To start, we
need to 
understand 2 
results
FLP IMPOSSIBILITY
RESULT
The FLP result shows that in
an asynchronous model,
where even one processor might
crash, there is no deterministic
distributed algorithm that is guaranteed
to solve the consensus problem.
Consensus is a fundamental problem in fault-tolerant 
distributed systems. Consensus involves multiple servers 
agreeing on values. Once they reach a decision on a value, 
that decision is final. 
Typical consensus algorithms make progress when any 
majority of their servers are available; for example, a cluster 
of 5 servers can continue to operate even if 2 servers fail. If 
more servers fail, they stop making progress (but will never 
return an incorrect result). 
Paxos (1989), Raft (2013)
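The majority arithmetic above can be checked with a few lines. The function names are hypothetical; the sketch only covers the counting rule, not a consensus protocol.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers may fail while a majority is still available."""
    return cluster_size - majority(cluster_size)


def can_make_progress(cluster_size: int, available: int) -> bool:
    """True while the available servers still form a majority."""
    return available >= majority(cluster_size)


quorum_of_5 = majority(5)
failures_of_5 = tolerated_failures(5)
progress_with_3 = can_make_progress(5, available=3)
progress_with_2 = can_make_progress(5, available=2)
```

This is also why clusters usually have an odd number of servers: going from 5 to 6 servers raises the majority to 4 without tolerating any additional failure.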
CAP THEOREM
CAP CONJECTURE
(2000) established
as a theorem in 2002
Consistency
Availability
Partition Tolerance
• Consistency ⇒ all nodes see the same data at the
same time
• Availability ⇒ the system keeps responding when node(s)
fail
• Partition-Tolerance ⇒ the system keeps working despite
arbitrary message loss
How does knowing 
FLP and CAP help me? 
It’s really about asking which
you would sacrifice when a
system fails: Consistency
or Availability?
You can choose C over A
The system needs to preserve the order of reads
and writes, i.e. linearizability,
i.e. a read will return the last
completed write (on ANY
replica), e.g. Two-Phase Commit
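A minimal in-process sketch of the two-phase commit idea named above. `Participant` and `two_phase_commit` are hypothetical names, and the sketch deliberately leaves out timeouts, logging and coordinator crashes, which is where a real protocol blocks and gives up availability.

```python
class Participant:
    """A replica that votes in phase 1 and applies the decision in phase 2."""

    def __init__(self, name: str, will_vote_yes: bool = True) -> None:
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "init"

    def prepare(self) -> bool:
        # Phase 1: promise to commit, or refuse.
        self.state = "prepared" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list[Participant]) -> bool:
    # Phase 1: collect a vote from every participant (no short-circuit).
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2: everyone applies the same decision.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


healthy = [Participant("a"), Participant("b"), Participant("c")]
healthy_ok = two_phase_commit(healthy)

mixed = [Participant("a"), Participant("b", will_vote_yes=False), Participant("c")]
mixed_ok = two_phase_commit(mixed)
```

A single "no" vote aborts the write on every replica, so the replicas never disagree, at the cost of refusing the write.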
You can choose A over C 
It might return stale 
reads … 
Amazon’s Dynamo uses vector clocks;
Cassandra uses a clever form
of last-write-wins
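The two conflict-handling strategies named above can be sketched in a few lines. This is a simplified illustration, not the code of any of these databases: the function names, node names and timestamps are hypothetical.

```python
def increment(clock: dict[str, int], node: str) -> dict[str, int]:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Element-wise maximum of two vector clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Classify `a` relative to `b`: before, after, equal or concurrent."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither saw the other: a real conflict


def last_write_wins(writes: list[tuple[float, str]]) -> str:
    """Keep the value with the highest timestamp; other writes are dropped."""
    return max(writes, key=lambda w: w[0])[1]


base = increment({}, "n1")       # first write, handled by n1
left = increment(base, "n1")     # update handled by n1
right = increment(base, "n2")    # update of the same base handled by n2

base_vs_left = compare(base, left)
left_vs_right = compare(left, right)
merged = merge(left, right)
winner = last_write_wins([(10.0, "cart-v1"), (12.5, "cart-v2")])
```

Vector clocks detect that `left` and `right` conflict and leave the resolution to the reader, while last-write-wins resolves silently and loses one of the writes.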
You cannot NOT choose P 
Partition Tolerance is mandatory 
in distributed systems
To scale, partition 
To be resilient, replicate
[Diagram: Partitioning vs Replication. Data items A, B, C and key ranges [0..100] and [101..200] placed on Node 1 and Node 2]
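A small sketch of the two ideas together, reusing the key ranges and node names from the slide. The layout itself (which node is listed first for which range) and the function `owner` are assumptions made for illustration.

```python
# Hypothetical layout: each partition owns an inclusive key range and is
# stored on every node in its replica list, preferred node first.
PARTITIONS = [
    {"range": (0, 100), "replicas": ["Node 1", "Node 2"]},
    {"range": (101, 200), "replicas": ["Node 2", "Node 1"]},
]


def owner(key: int, failed: frozenset[str] = frozenset()) -> str:
    """Return the first live replica of the partition that holds `key`."""
    for partition in PARTITIONS:
        low, high = partition["range"]
        if low <= key <= high:
            for node in partition["replicas"]:
                if node not in failed:
                    return node
            raise RuntimeError("all replicas of this partition have failed")
    raise KeyError(key)


normal_low = owner(42)                               # partitioning spreads load
normal_high = owner(150)
failover = owner(42, failed=frozenset({"Node 1"}))   # replication survives a failure
```

Partitioning decides which node normally serves a key, and replication decides who takes over when that node is gone.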
Replication is strongly 
related to fault-tolerance 
Fault-tolerance relies 
on reaching consensus 
Paxos (1989), ZAB, Raft 
(2013)
Consistent Hashing is
often used for PARTITIONING
and REPLICATION
C-H is commonly used in load-balancing
web objects, and in
DynamoDB, Cassandra, Riak and
Memcached
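A minimal hash-ring sketch of consistent hashing, assuming SHA-256 as the hash and 100 virtual nodes per server. `HashRing`, `node_for` and the node names are hypothetical, and real systems differ in hashing and replica placement.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Map a string to a point on the ring using the first 8 bytes of SHA-256.
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first point after the key, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"key-{i}" for i in range(1000)]
before = HashRing(["A", "B", "C"])
after = HashRing(["A", "B"])  # node C has left the cluster

# Only keys that lived on C need to move; everything else stays put.
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
```

With a plain `hash(key) % number_of_nodes` scheme, removing a node would remap most keys. On the ring, only the departed node's keys are reassigned.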
• If your problem can fit into the memory of a
single machine, you don’t need a
distributed system
• If you really need to design / build a
distributed system, the following helps:
  • Design for failure
  • Use FLP and CAP to critique
  systems
  • Algorithms that work on a single node
  may not work in distributed mode
  • Avoid coordination between nodes
  • Learn to estimate your capacity
Jeff Dean - Google
REFERENCES
• “The network is reliable,” http://queue.acm.org/detail.cfm?id=2655736
• http://cacm.acm.org/blogs/blog-cacm/83396-errors-in-database-systems-eventual-consistency-and-the-cap-theorem/fulltext
• http://mvdirona.com/jrh/talksAndPapers/JamesRH_Lisa.pdf
• “Towards Robust Distributed Systems,” http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
• “Why Do Computers Stop and What Can be Done About It,” Tandem Computers Technical Report 85.7, Cupertino, Ca., 1985. http://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf
• http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
• http://codahale.com/you-cant-sacrifice-partition-tolerance/
• “Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web,” http://dl.acm.org/citation.cfm?id=258660

DevOps and Testing slides at DASA Connect
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 

Distributed computing for new bloods

  • 1. DISTRIBUTED COMPUTING FOR NEW BLOODS RAYMOND TAY I WORK IN HEWLETT-PACKARD LABS SINGAPORE @RAYMONDTAYBL
  • 2. What is a Distributed System?
  • 3. What is a Distributed System? It’s been there for a long time
  • 4. What is a Distributed System? It’s been there for a long time You did not realize it
  • 5. DISTRIBUTED SYSTEMS ARE THE NEW FRONTIER whether YOU like it or not
  • 6. DISTRIBUTED SYSTEMS ARE THE NEW FRONTIER whether YOU like it or not LAN Games Mobile Databases ATMs Social Media
  • 7. WHAT IS THE ESSENCE OF DISTRIBUTED SYSTEMS?
  • 8. WHAT IS THE ESSENCE OF DISTRIBUTED SYSTEMS? IT IS ABOUT THE ATTEMPT TO OVERCOME INFORMATION TRAVEL WHEN INDEPENDENT PROCESSES FAIL
  • 9. Information flows at the speed of light! MESSAGE X Y
  • 10. When independent things DON’T fail. I’ve sent it. I’ve received it. X Y I’ve received it. I’ve sent it.
  • 11. When independent things fail, independently. X Y NETWORK FAILURE
  • 12. When independent things fail, independently. X Y NODE FAILURE
  • 13. There is no difference between a slow node and a dead node
  • 14. Why do we NEED it? Scalability: when the needs of the system outgrow what a single node can provide
  • 15. Why do we NEED it? Scalability: when the needs of the system outgrow what a single node can provide. Availability: enabling resilience when a node fails
  • 16. IS IT REALLY THAT HARD?
  • 17. IS IT REALLY THAT HARD? The answer lies in knowing what is possible and what is not possible…
  • 18. IS IT REALLY THAT HARD? Even widely accepted facts are challenged…
  • 19. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable (Peter Deutsch and other fellows at Sun Microsystems)
  • 20. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable
  • 21. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure (Peter Deutsch and other fellows at Sun Microsystems)
  • 22. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure (2013 cost of cyber crime study: United States)
  • 23. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure; The network is homogeneous (Peter Deutsch and other fellows at Sun Microsystems)
  • 24. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure; The network is homogeneous; Latency is zero (Peter Deutsch and other fellows at Sun Microsystems; Ingo Rammer on latency vs bandwidth)
  • 25. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure; The network is homogeneous; Latency is zero; Bandwidth is infinite (Peter Deutsch and other fellows at Sun Microsystems)
  • 26. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure; The network is homogeneous; Latency is zero; Bandwidth is infinite; Topology doesn’t change (Peter Deutsch and other fellows at Sun Microsystems)
  • 27. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure; The network is homogeneous; Latency is zero; Bandwidth is infinite; Topology doesn’t change; There is one administrator (Peter Deutsch and other fellows at Sun Microsystems)
  • 28. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure; The network is homogeneous; Latency is zero; Bandwidth is infinite; Topology doesn’t change; There is one administrator; Transport cost is zero (Peter Deutsch and other fellows at Sun Microsystems)
  • 29. THE FALLACIES OF DISTRIBUTED COMPUTING: The network is reliable; The network is secure; The network is homogeneous; Latency is zero; Bandwidth is infinite; Topology doesn’t change; There is one administrator; Transport cost is zero (Peter Deutsch and other fellows at Sun Microsystems)
  • 30. WHAT WAS TRIED IN THE PAST? DISTRIBUTED OBJECTS REMOTE PROCEDURE CALL DISTRIBUTED SHARED MUTABLE STATE
  • 31. WHAT WAS ATTEMPTED? DISTRIBUTED OBJECTS REMOTE PROCEDURE CALL DISTRIBUTED SHARED MUTABLE STATE
      • Typically, programming constructs to “abstract” the fact that there are local and distributed objects
      • Ignores latencies
      • Handles failures with this attitude → “i dunno what to do…YOU (i.e. invoker) handle it”.
  • 32. WHAT WAS ATTEMPTED? DISTRIBUTED OBJECTS REMOTE PROCEDURE CALL DISTRIBUTED SHARED MUTABLE STATE
      • Assumes the synchronous processing model
      • Asynchronous RPC tries to model itself after synchronous RPC…
  • 33. WHAT WAS ATTEMPTED? DISTRIBUTED OBJECTS REMOTE PROCEDURE CALL DISTRIBUTED SHARED MUTABLE STATE
      • Found in Distributed Shared Memory Systems, i.e. “1” address space partitioned into “x” address spaces for “x” nodes
      • e.g. JavaSpaces ⇒ Danger of 2 independent processes both committing successfully.
  • 34. YOU CAN APPRECIATE THAT IT IS VERY HARD TO SOLVE IN ITS ENTIRETY
  • 35. The question really is “How can we do better?”
  • 36. The question really is “How can we do better?” To start, we need to understand 2 results
  • 37. FLP IMPOSSIBILITY RESULT The FLP result shows that in an asynchronous model, where even one processor might crash, there is no deterministic distributed algorithm that is guaranteed to solve the consensus problem.
  • 38. Consensus is a fundamental problem in fault-tolerant distributed systems. Consensus involves multiple servers agreeing on values. Once they reach a decision on a value, that decision is final. Typical consensus algorithms make progress when any majority of their servers are available; for example, a cluster of 5 servers can continue to operate even if 2 servers fail. If more servers fail, they stop making progress (but will never return an incorrect result). Paxos (1989), Raft (2013)
  • 39. CAP THEOREM CAP CONJECTURE (2000) established as a theorem in 2002 Consistency Availability Partition Tolerance
  • 40. CAP THEOREM CAP CONJECTURE (2000) established as a theorem in 2002 Consistency Availability Partition Tolerance • Consistency ⇒ all nodes should see the same data, eventually
  • 41. CAP THEOREM CAP CONJECTURE (2000) established as a theorem in 2002 Consistency Availability Partition Tolerance • Consistency ⇒ all nodes should see the same data, eventually • Availability ⇒ System is still working when node(s) fail
  • 42. CAP THEOREM CAP CONJECTURE (2000) established as a theorem in 2002 Consistency Availability Partition Tolerance • Consistency ⇒ all nodes should see the same data, eventually • Availability ⇒ System is still working when node(s) fail
  • 43. CAP THEOREM CAP CONJECTURE (2000) established as a theorem in 2002 Consistency Availability Partition Tolerance • Consistency ⇒ all nodes should see the same data, eventually • Availability ⇒ System is still working when node(s) fail • Partition-Tolerance ⇒ System is still working on arbitrary message loss
  • 44. How does knowing FLP and CAP help me?
  • 45. How does knowing FLP and CAP help me? It’s really about asking which you would sacrifice when a system fails: Consistency or Availability?
  • 46. You can choose C over A. It needs to preserve reads and writes, i.e. linearizability: a read will return the last completed write (on ANY replica). E.g. Two-Phase Commit
  • 47. You can choose A over C. It might return stale reads …
  • 48. You can choose A over C. It might return stale reads … Amazon’s Dynamo uses vector clocks; Cassandra uses a clever form of last-write-wins
  • 49. You cannot NOT choose P. Partition Tolerance is mandatory in distributed systems
  • 50. To scale, partition. To be resilient, replicate
  • 51. [Diagram: Partitioning and Replication of items A, B and C over key ranges [0..100] and [101..200] on Node 1 and Node 2]
  • 52. Replication is strongly related to fault-tolerance. Fault-tolerance relies on reaching consensus: Paxos (1989), ZAB, Raft (2013)
  • 53. Consistent Hashing is often used for PARTITIONING and REPLICATION. C-H is commonly used in load-balancing web objects. Found in DynamoDB, Cassandra, Riak and Memcached
  • 54. • If your problem can fit into the memory of a single machine, you don’t need a distributed system
      • If you really need to design / build a distributed system, the following helps:
          • Design for failure
          • Use the FLP and CAP to critique systems
          • Algorithms that work for single-node may not work in distributed mode
          • Avoid coordination of nodes
          • Learn to estimate your capacity
  • 55. Jeff Dean - Google
  • 56. REFERENCES
      • “The network is reliable,” http://queue.acm.org/detail.cfm?id=2655736
      • http://cacm.acm.org/blogs/blog-cacm/83396-errors-in-database-systems-eventual-consistency-and-the-cap-theorem/fulltext
      • http://mvdirona.com/jrh/talksAndPapers/JamesRH_Lisa.pdf
      • “Towards Robust Distributed Systems,” http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
      • “Why Do Computers Stop and What Can be Done About It,” Tandem Computers Technical Report 85.7, Cupertino, Ca., 1985. http://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf
      • http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
      • http://codahale.com/you-cant-sacrifice-partition-tolerance/
      • “Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web,” http://dl.acm.org/citation.cfm?id=258660
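Slide 38 says that typical consensus algorithms make progress while any majority of servers is available, so a cluster of 5 tolerates 2 failures. The arithmetic behind that claim can be sketched in Python as follows. The function names are hypothetical and chosen for illustration; they are not taken from any Paxos or Raft implementation.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers may fail while a majority is still reachable."""
    return cluster_size - majority(cluster_size)


def can_make_progress(cluster_size: int, failed: int) -> bool:
    """True if the surviving servers still form a majority."""
    return cluster_size - failed >= majority(cluster_size)


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(n, "servers: majority =", majority(n),
              "tolerated failures =", tolerated_failures(n))
```

One consequence of this rule is that an even-sized cluster tolerates no more failures than the odd-sized cluster one server smaller, which is one reason odd cluster sizes are common.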
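Slide 48 contrasts vector clocks with last-write-wins as ways to reconcile replicas when availability is chosen over consistency. The sketch below shows the difference in Python under simplifying assumptions: a vector clock is modelled as a plain dict from node name to counter, and a last-write-wins version is a `(timestamp, value)` tuple. Both representations and both function names are illustrative only and do not reproduce how Dynamo or Cassandra store versions.

```python
def compare_clocks(a: dict, b: dict) -> str:
    """Compare two vector clocks.

    Returns "before" if a happened before b, "after" if b happened
    before a, "equal" if they are identical, and "concurrent" if
    neither dominates (a conflict the application must resolve).
    """
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def last_write_wins(a: tuple, b: tuple) -> tuple:
    """Pick the version with the larger timestamp (ties keep `a`).

    Simple, but a concurrent write with a smaller timestamp is
    silently discarded.
    """
    return a if a[0] >= b[0] else b


if __name__ == "__main__":
    # Two replicas accepted writes independently: neither clock dominates.
    print(compare_clocks({"x": 2, "y": 1}, {"x": 1, "y": 2}))
    print(last_write_wins((10, "old"), (12, "new")))
```

Vector clocks detect the conflict and leave resolution to the caller; last-write-wins always produces a single answer at the cost of possibly losing an update.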
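Slide 53 names consistent hashing as a common basis for partitioning and replication. A minimal Python sketch of a hash ring with virtual nodes is given below. The class name `HashRing`, the `vnodes` parameter, the use of SHA-256 and the "walk clockwise to find replicas" rule are assumptions made for illustration; real systems differ in hash function, token assignment and replica placement.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring (illustrative, not production code)."""

    def __init__(self, nodes=(), vnodes=64):
        self._vnodes = vnodes   # virtual points per physical node
        self._ring = []         # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key: str, replicas: int = 1) -> list:
        """Return up to `replicas` distinct nodes, walking clockwise."""
        if not self._ring:
            raise LookupError("ring is empty")
        start = bisect.bisect_left(self._ring, (self._hash(key),))
        owners = []
        for i in range(len(self._ring)):
            node = self._ring[(start + i) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == replicas:
                    break
        return owners


if __name__ == "__main__":
    ring = HashRing(["node1", "node2", "node3"])
    print(ring.lookup("user:42", replicas=2))
```

The property that makes this attractive for partitioning is that removing a node only reassigns the keys that node owned; every other key keeps its owner, so little data has to move.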