Unit 2 – Cloud Computing
Applications and Paradigms
1
Contents
■ Challenges for cloud computing.
■ Architectural styles for cloud applications.
■ Workflows - coordination of multiple activities.
■ Coordination based on a state machine model:
The Zookeeper
■ The MapReduce programming model.
■ A case study: the GrepTheWeb application.
■ Clouds for science and engineering.
■ High performance computing on a cloud.
2
Cloud applications
■ Cloud computing is very attractive to users:
■ Economic reasons.
❖ Low infrastructure investment.
❖ Low cost - customers are billed only for the
resources they use.
3
■ Convenience and performance.
❖ Developers enjoy the advantages of a just-in-time
infrastructure; they are free to design an application
without being concerned with the system where the
application will run.
❖ Execution time of compute-intensive and data-intensive
applications can, potentially, be reduced through
parallelization. If an application can partition the
workload in n segments and spawn n instances of itself,
then the execution time could be reduced by a factor
close to n.
4
■ Cloud computing is also beneficial for the providers of
computing cycles - it typically leads to a higher level of
resource utilization.
5
Cloud applications (cont’d)
■ Ideal applications for cloud computing:
❖ Web services, e.g., Salesforce.com.
❖ Database services, e.g., Google App Engine.
❖ Transaction-based services - the resource
requirements of transaction-oriented services benefit
from an elastic environment where resources are
available when needed and where one pays only for the
resources consumed.
6
■ Not all applications are suitable for cloud computing:
❖ Applications with a complex workflow and multiple
dependencies, as is often the case in high-
performance computing.
❖ Applications which require intensive communication
among concurrent instances.
❖ When the workload cannot be arbitrarily partitioned.
7
2.1 Challenges for cloud application
development
■ Performance isolation - nearly impossible to reach in a
real system, especially when the system is heavily
loaded.
■ Reliability - a major concern; server failures are expected
when a large number of servers cooperate in a computation.
8
■ Cloud infrastructure exhibits latency and bandwidth
fluctuations which affect the application performance.
■ Frequent logging helps identify the source of unexpected
results and errors, but performance considerations limit the
amount of data logging.
9
2.2 Architectural styles for cloud applications
■ Based on the client-server paradigm.
■ Stateless servers - view a client request as an
independent transaction and respond to it; the client
is not required to first establish a connection to the
server.
■ Often clients and servers communicate using Remote
Procedure Calls (RPCs).
10
■ Simple Object Access Protocol (SOAP) - application
protocol for web applications; message format based on
XML. Uses TCP or UDP transport protocols.
■ Representational State Transfer (REST) - software
architecture for distributed hypermedia systems.
Supports client communication with stateless servers, it
is platform independent, language independent, supports
data caching, and can be used in the presence of
firewalls.
11
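The stateless-server idea above can be sketched as a handler that treats each request as a self-contained transaction, so any replica can answer it without session state. The resource paths, request shape, and handler below are illustrative assumptions, not a real REST framework.

```python
# Minimal model of a stateless server: each request carries everything the
# server needs, so any replica can answer it independently (REST style).
# The resource table and request format are illustrative assumptions.

RESOURCES = {"/tasks/1": {"id": 1, "state": "running"}}

def handle(request):
    """Treat each client request as an independent transaction."""
    if request["method"] == "GET":
        body = RESOURCES.get(request["path"])
        return {"status": 200, "body": body} if body else {"status": 404, "body": None}
    return {"status": 405, "body": None}

# Two requests, possibly served by different replicas; no prior connection
# setup or session state is required.
r1 = handle({"method": "GET", "path": "/tasks/1"})
r2 = handle({"method": "GET", "path": "/tasks/2"})
```

Because each request is independent, the same handler can run behind any number of load-balanced servers.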
2.3 Workflows: Coordination of multiple
activities
Workflow
■ Many cloud applications require the completion of
multiple interdependent tasks; the description of a
complex activity involving such tasks is known as a
workflow.
Workflow models
■ Workflow models are abstractions revealing the most
important properties of the entities participating in a
workflow management system.
12
Task
■ Task is the central concept in workflow modeling; a task is
a unit of work to be performed on the cloud.
Attributes of task
1. Name: A string of characters uniquely identifying the task.
2. Description: A natural language description of the task.
3. Actions: Modifications of the environment caused by the
execution of the task.
4. Preconditions: Boolean expressions that must be true
before the action(s) of the task can take place.
13
5. Post-conditions: Boolean expressions that must be true
after the action(s) of the task take place.
6. Attributes: Provide indications of the type and quantity
of resources necessary for the execution of the task, the
actors in charge of the tasks, the security requirements,
whether the task is reversible, and other task
characteristics.
14
7. Exceptions: Provide information on how to handle
abnormal events.
❖ The exceptions supported by a task consist of a list of
<event, action> pairs.
❖ The exceptions included in the task exception list are
called anticipated exceptions, as opposed to
unanticipated exceptions.
❖ Events not included in the exception list trigger
replanning.
❖ Replanning means restructuring of a process or
redefinition of the relationship among various tasks.
15
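The exception handling described above can be sketched as a dispatch table of &lt;event, action&gt; pairs attached to a task; an event missing from the list is unanticipated and triggers replanning. The event and action names below are hypothetical.

```python
# Sketch of a task's exception handling: anticipated exceptions are a list
# of (event, action) pairs; an event not on the list triggers replanning.
# Event names and actions are illustrative assumptions.

def replan(event):
    """Stand-in for replanning: restructuring the process."""
    return f"replanning triggered by {event}"

class Task:
    def __init__(self, name, exceptions):
        self.name = name
        self.exceptions = dict(exceptions)   # event -> action

    def on_event(self, event):
        action = self.exceptions.get(event)
        if action is None:                   # unanticipated exception
            return replan(event)
        return action(event)                 # anticipated exception

t = Task("transfer", [("timeout", lambda e: "retry"),
                      ("disk_full", lambda e: "abort")])
```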
Different types of Tasks
■ A composite task is a structure describing a subset of
tasks and the order of their execution.
■ A primitive task is one that cannot be decomposed
into simpler tasks.
❖ A composite task inherits some properties from
workflows; it consists of tasks and has one start
symbol and possibly several end symbols.
❖ At the same time, a composite task inherits some
properties from tasks; it has a name, preconditions,
and post-conditions.
16
■ A routing task is a special-purpose task connecting two
tasks in a workflow description.
■ The task that has just completed execution is called the
predecessor task;
■ the one to be initiated next is called the successor task.
■ A routing task could trigger a sequential, concurrent, or
iterative execution. Several types of routing task exist:
17
■ A fork routing task triggers execution of several
successor tasks. Several semantics for this construct are
possible:
• All successor tasks are enabled.
• Each successor task is associated with a condition.
The conditions for all tasks are evaluated, and only
the tasks with a true condition are enabled.
• Each successor task is associated with a condition.
The conditions for all tasks are evaluated, but the
conditions are mutually exclusive and only one
condition may be true. Thus, only one task is enabled.
• Nondeterministic, k out of n > k successors are
selected at random to be enabled.
18
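The four fork-routing semantics listed above can be sketched as follows; the successor names and conditions are illustrative assumptions.

```python
# Sketch of the four fork-routing semantics: each successor task is a
# (name, condition) pair; names and conditions are illustrative.
import random

def fork_all(successors):
    """All successor tasks are enabled."""
    return [name for name, _ in successors]

def fork_conditional(successors, ctx):
    """Only successors whose condition is true are enabled."""
    return [name for name, cond in successors if cond(ctx)]

def fork_exclusive(successors, ctx):
    """Mutually exclusive conditions: at most one successor is enabled."""
    enabled = [name for name, cond in successors if cond(ctx)]
    assert len(enabled) <= 1, "conditions must be mutually exclusive"
    return enabled

def fork_nondeterministic(successors, k, rng=random):
    """k out of n > k successors are selected at random."""
    return rng.sample([name for name, _ in successors], k)

succ = [("B", lambda c: c["x"] > 0), ("C", lambda c: c["x"] <= 0)]
```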
■ A join routing task waits for completion of its
predecessor tasks. There are several semantics for the
join routing task:
• The successor is enabled after all predecessors end.
• The successor is enabled after k out of n > k
predecessors end.
• Iterative: The tasks between the fork and the join are
executed repeatedly.
19
■ Process description - structure describing the tasks
to be executed and the order of their execution.
Resembles a flowchart.
■ Case - an instance of a process description.
20
■ State of a case at time t - defined in terms of tasks
already completed at that time.
■ Events - cause transitions between states.
■ The life cycle of a workflow - creation, definition,
verification, and enactment; similar to the life cycle of a
traditional program (creation, compilation, and
execution).
21
22
Safety and liveness
■ Desirable properties of workflows.
■ Safety 🡪 nothing “bad” ever happens.
■ Liveness 🡪 something “good” will eventually happen.
23
[Figure 1: (a) a process description violating the liveness requirement; (b) two tasks deadlocking on resources r and q]
(a) A process description that violates the liveness
requirement. If task C is chosen after completion of B,
the process will terminate after executing task G; if D is
chosen, then F will never be instantiated, because it
requires the completion of both C and E. The process
will never terminate, because G requires completion of
both D and F .
(b) Tasks A and B need exclusive access to two resources
r and q, and a deadlock may take place if the following
sequence of events occurs. At time t1 task A acquires r,
at time t2 task B acquires q and continues to run; then at
time t3 task B attempts to acquire r and it blocks
because r is under the control of A. Task A continues to
run and at time t4 attempts to acquire q and it blocks
because q is under the control of B.
Deadlock avoidance solution
■ The deadlock illustrated in Figure 1 (b) can be avoided
by requesting each task to acquire all resources at the
same time.
■ The price to pay is underutilization of resources.
■ Indeed, the idle time of each resource increases under
this scheme.
27
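The rule above (each task acquires all its resources at the same time) can be sketched with two tasks contending for r and q; a single allocation lock makes the paired acquisition atomic, so the t1..t4 interleaving of Figure 1(b) cannot occur. This is a minimal illustration under that assumption, not a general resource manager.

```python
# Deadlock avoidance by all-at-once acquisition: both resources are taken
# under one global allocation lock, so no task can hold r while waiting
# for q (or vice versa). Task names are illustrative.
import threading

alloc_lock = threading.Lock()
r, q = threading.Lock(), threading.Lock()

def run_task(name, log):
    with alloc_lock:          # acquire r and q atomically
        r.acquire()
        q.acquire()
    try:
        log.append(name)      # critical section using both r and q
    finally:
        r.release()
        q.release()

log = []
threads = [threading.Thread(target=run_task, args=(n, log)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The price, as noted above, is that a task holds both resources for its whole critical section even if it needs only one at a time.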
Basic workflow patterns
■ Workflow patterns - the temporal relationship among
the tasks of a process.
■ These patterns are classified in several categories:
basic, advanced branching and synchronization,
structural, state based, cancellation, and patterns
involving multiple instances.
28
29
■ Sequence - several tasks
have to be scheduled one
after the completion of
the other.
■ AND split - both tasks B
and C are activated when
task A terminates.
30
■ Synchronization - task
C can only start after
tasks A and B
terminate.
■ XOR split - after
completion of task A,
either B or C can be
activated.
31
■ XOR merge - task C
is enabled when
either A or B
terminate.
■ OR split - after
completion of task A
one could activate
either B, C, or both.
32
■ Multiple Merge - once task
A terminates, B and C
execute concurrently;
when the first of them, say
B, terminates, then D is
activated; then, when C
terminates, D is activated
again.
■ Discriminator - waits for a number of incoming
branches to complete before activating the
subsequent activity; then waits for the remaining
branches to finish without taking any action; once
all of them have terminated, it resets itself.
33
■ N out of M join - barrier
synchronization. Assuming that
M tasks run concurrently, N
(N<M) of them have to reach
the barrier before the next task
is enabled. In our example, any
two out of the three tasks A, B,
and C have to finish before E is
enabled.
■ Deferred Choice -
similar to the XOR
split but the choice
is not made
explicitly; the run-
time environment
decides what
branch to take.
34
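The discriminator described above can be sketched as a small state machine for the common 1-out-of-n case: the first completed branch activates the subsequent activity, later completions are absorbed, and the discriminator resets after all branches finish.

```python
# Sketch of the discriminator pattern (1-out-of-n case): fire on the first
# completed branch, ignore the rest, reset once all branches terminate.

class Discriminator:
    def __init__(self, n_branches):
        self.n = n_branches
        self.done = 0        # branches completed in the current round
        self.fired = False   # has the subsequent activity been activated?

    def branch_completed(self):
        """Return True exactly once per round, on the first completion."""
        self.done += 1
        fire = not self.fired
        self.fired = True
        if self.done == self.n:          # all branches finished: reset
            self.done, self.fired = 0, False
        return fire

d = Discriminator(3)
events = [d.branch_completed() for _ in range(6)]   # two full rounds
```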
2.4 Coordination based on a state machine
model : ZooKeeper
■ Cloud elasticity 🡪 distribute computations and data across
multiple systems; coordination among these systems is a
critical function in a distributed environment.
35
36
37
38
ZooKeeper
▪ Distributed coordination service for large-scale
distributed systems.
▪ High throughput and low latency service.
▪ Open-source software written in Java with bindings for
Java and C.
39
▪ The servers in the pack communicate and elect a leader.
▪ A database is replicated on each server; consistency of
the replicas is maintained.
▪ A client connects to a single server using TCP,
synchronizes its clock with the server, and sends
requests, receives responses, and watches events through
this TCP connection.
40
41
FIGURE : The ZooKeeper coordination service.
(a) The service provides a single system image. Clients can
connect to any server in the pack.
(b) Functional model of the ZooKeeper service. The
replicated database is accessed directly by read
commands; write commands involve more intricate
processing based on atomic broadcast.
(c) Processing a write command:
(1) A server receiving the command from a client
forwards the command to the leader;
(2) the leader uses atomic broadcast to reach consensus
among all followers.
42
Shared hierarchical namespace similar to a
file system; znodes instead of inodes
43
ZooKeeper service guarantees
■ Atomicity - a transaction either completes or fails.
■ Sequential consistency of updates - updates are applied
strictly in the order they are received.
■ Single system image for the clients - a client receives the
same response regardless of the server it connects to.
■ Persistence of updates - once applied, an update persists
until it is overwritten by a client.
■ Reliability - the system is guaranteed to function correctly
as long as the majority of servers function correctly.
44
ZooKeeper communication
■ Messaging layer 🡪 responsible for the election of a
new leader when the current leader fails.
■ Messaging protocols use:
■ Packets - sequence of bytes sent through a
FIFO channel.
■ Proposals - units of agreement.
■ Messages - sequence of bytes atomically
broadcast to all servers.
45
ZooKeeper communication (cont’d)
■ A message is included into a proposal and it is agreed
upon before it is delivered.
■ Proposals are agreed upon by exchanging packets with
a quorum of servers, as required by the Paxos algorithm.
46
ZooKeeper communication (cont’d)
■ Messaging layer guarantees:
❖ Reliable delivery: if a message m is delivered to one
server, it will be eventually delivered to all servers.
❖ Total order: if message m is delivered before message n
to one server, it will be delivered before n to all servers.
❖ Causal order: if message n is sent after m has been
delivered by the sender of n, then m must be ordered
before n.
47
ZooKeeper API
■ The API is simple - it consists of seven operations:
❖ Create - add a node at a given location in the tree.
❖ Delete - delete a node.
❖ Exists - test whether a node exists at a given location.
❖ Get data - read data from a node.
❖ Set data - write data to a node.
❖ Get children - retrieve a list of the children of a node.
❖ Sync - wait for the data to propagate.
48
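The namespace and API above can be sketched as a single-process, in-memory znode tree; the real service replicates this tree across the server pack and coordinates writes by atomic broadcast. The paths and data below are illustrative.

```python
# A minimal in-memory sketch of the ZooKeeper namespace: znodes in a
# hierarchical tree addressed by file-system-like paths, with the seven
# API operations. This is a single-copy stand-in, not the real replicated
# service; sync therefore has nothing to wait for.

class ZNodeTree:
    def __init__(self):
        self.nodes = {"/": b""}          # path -> data

    def create(self, path, data=b""):    # add a node at a given location
        parent = path.rsplit("/", 1)[0] or "/"
        assert parent in self.nodes, "parent znode must exist"
        self.nodes[path] = data

    def delete(self, path):              # delete a node
        del self.nodes[path]

    def exists(self, path):              # test whether a node exists
        return path in self.nodes

    def get_data(self, path):            # read data from a node
        return self.nodes[path]

    def set_data(self, path, data):      # write data to a node
        assert path in self.nodes
        self.nodes[path] = data

    def get_children(self, path):        # list the children of a node
        parent = path.rstrip("/")
        return sorted(p for p in self.nodes
                      if p != "/" and p.rsplit("/", 1)[0] == parent)

    def sync(self, path):                # wait for the data to propagate
        return True                      # single copy: nothing to wait for

zk = ZNodeTree()
zk.create("/locks")
zk.create("/locks/l1", b"owner=worker-3")
```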
2.6 The MapReduce programming
model
Advantage of cloud computing: Elasticity and load
distribution
■ Elasticity 🡪 ability to use as many servers as necessary
to optimally respond to cost and timing constraints of an
application.
■ Load distribution 🡪 front-end system distributes the
incoming requests to a number of back-end systems and
attempts to balance the load among them. As the
workload increases, new back-end systems are added to
the pool.
49
Load distribution 🡪
■ How to divide the load
◻ Transaction processing systems 🡪 a front-end
distributes the incoming transactions to a number of
back-end systems. As the workload increases new
back-end systems are added to the pool.
◻ For data-intensive batch applications two types of
divisible workloads are possible:
■ modularly divisible 🡪 the workload partitioning is
defined a priori.
■ arbitrarily divisible 🡪 the workload can be
partitioned into an arbitrarily large number of
smaller workloads of equal or nearly equal size.
50
■ Many applications in physics, biology, and other areas of
computational science and engineering obey the
arbitrarily divisible load sharing model.
51
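The arbitrarily divisible case can be sketched as a function that cuts a workload of independent items into any number of segments differing in size by at most one; the item and worker counts are illustrative.

```python
# Sketch of an arbitrarily divisible workload: n_items independent items
# split across n_workers segments of equal or nearly equal size.

def split_arbitrary(n_items, n_workers):
    """Return the segment sizes; they differ by at most one item."""
    size, rem = divmod(n_items, n_workers)
    return [size + (1 if i < rem else 0) for i in range(n_workers)]

# Example: 10 independent records spread over 3 workers.
sizes = split_arbitrary(10, 3)
```

A modularly divisible workload, by contrast, has its partitioning fixed a priori and cannot be re-cut this way.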
MapReduce philosophy
1. An application starts a master instance, M worker
instances for the Map phase and later R worker
instances for the Reduce phase.
2. The master instance partitions the input data into M
segments.
3. Each map instance reads its input data segment and
processes the data.
4. The results of the processing are stored on the local
disks of the servers where the map instances run.
52
5. When all map instances have finished processing their
data, the R reduce instances read the results of the first
phase and merge the partial results.
6. The final results are written by the reduce instances to a
shared storage server.
7. The master instance monitors the reduce instances and
when all of them report task completion the application
is terminated.
53
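The seven steps above can be simulated in a single process: the input is partitioned into M segments, M map instances process their segments, and the reduce step merges the partial results. Word counting is an illustrative workload; in-memory lists stand in for the local disks and the shared storage server.

```python
# Single-process simulation of the MapReduce steps described above.
from collections import Counter

def map_phase(segment):
    """Steps 3-4: a map instance processes its segment, keeping results local."""
    return Counter(word for line in segment for word in line.split())

def reduce_phase(partials):
    """Steps 5-6: reduce instances read the partial results and merge them."""
    total = Counter()
    for p in partials:
        total.update(p)
    return dict(total)

lines = ["the quick fox", "the lazy dog", "the fox"]
M = 3
segments = [lines[i::M] for i in range(M)]    # step 2: master partitions input
partials = [map_phase(s) for s in segments]   # map phase (M instances)
result = reduce_phase(partials)               # reduce phase, final merge
```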
54
2.7 A Case study: The GrepTheWeb
application
■ The application illustrates the means to
◻ create an on-demand infrastructure.
◻ run it on a massively distributed system in a manner
that allows it to run in parallel and scale up and down,
based on the number of users and the problem size.
55
■ GrepTheWeb
◻ Performs a search of a very large set of
records to identify records that satisfy a regular
expression.
◻ It is analogous to the Unix grep command.
◻ The source is a collection of document URLs
produced by the Alexa Web Search, a software
system that crawls the web every night.
◻ Uses message passing to trigger the activities
of multiple controller threads, which launch the
application, initiate processing, shut down the
system, and create billing records.
56
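The grep step can be sketched as a map task applying a regular expression to a batch of records, analogous to Unix grep; the records and pattern below are illustrative, while the five-match limit mirrors the description later in this section.

```python
# Sketch of a GrepTheWeb-style map task: apply a regular expression to a
# batch of input records and keep the matching ones, with a description of
# up to max_matches matches each. Records and pattern are illustrative.
import re

def grep_map_task(records, pattern, max_matches=5):
    """Return (record, matches) pairs for records matching the pattern."""
    rx = re.compile(pattern)
    out = []
    for rec in records:
        matches = rx.findall(rec)
        if matches:
            out.append((rec, matches[:max_matches]))
    return out

records = ["http://example.com/a.html", "ftp://old.host/b", "http://test.org"]
hits = grep_map_task(records, r"http://[\w.]+")
```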
(a) The simplified workflow
showing the inputs:
- the regular expression.
- the input records generated
by the web crawler.
- the user commands to report
the current status and to
terminate the processing.
(b) The detailed workflow.
The system is based on
message passing between
several queues; four
controller threads
periodically poll their
associated input queues,
retrieve messages, and
carry out the required
actions
57
FIGURE
■ The organization of the GrepTheWeb application.
■ The application uses the Hadoop MapReduce software
and four Amazon services: EC2, Simple DB, S3, and
SQS.
58
■ The simplified workflow
showing the two inputs:
the regular expression
and the input records
generated by the web
crawler.
■ A third type of input is
the user commands to
report the current
status and to terminate
the processing.
59
(b) The detailed workflow
60
61
Steps of Workflow
1. The startup phase.
■ Creates several queues – launch, monitor, billing, and
shutdown queues.
■ Starts the corresponding controller threads. Each thread
periodically polls its input queue and, when a message
is available, retrieves the message, parses it, and takes
the required actions.
62
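One controller thread's polling round, as described above, can be sketched as follows; `queue.Queue` stands in for an Amazon SQS queue, and the message format and "launch" action are illustrative assumptions.

```python
# Sketch of a controller thread's polling round: retrieve a message from
# its input queue, parse it, take the required action, then delete it.
import queue

def poll_once(q, actions, log):
    """Handle one message if available; return whether one was handled."""
    try:
        msg = q.get_nowait()               # retrieve the message
    except queue.Empty:
        return False                       # nothing to do this round
    kind, payload = msg.split(":", 1)      # parse it
    log.append(actions[kind](payload))     # take the required action
    q.task_done()                          # "delete" it from the queue
    return True

launch_q = queue.Queue()
launch_q.put("launch:job-42")
log = []
handled = poll_once(launch_q, {"launch": lambda p: f"launched {p}"}, log)
```

In the real system four such loops run concurrently, one per queue (launch, monitor, billing, shutdown).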
2. The processing phase.
■ This phase is triggered by a StartGrep user request; then
a launch message is enqueued in the launch queue.
■ The launch controller thread picks up the message and
executes the launch task; then, it updates the status and
time stamps in the Amazon Simple DB domain.
■ Finally, it enqueues a message in the monitor queue
and deletes the message from the launch queue.
63
■ The processing phase consists of the following steps:
a. The launch task starts Amazon EC2 instances. It uses
an Amazon Machine Image (AMI) with a preinstalled
Java Runtime Environment, deploys the required
Hadoop libraries, and starts a Hadoop job (running
Map/Reduce tasks).
64
b. Hadoop runs map tasks on Amazon EC2 slave
nodes in parallel. A map task takes files from
Amazon S3, runs a regular expression over them, and
writes the match results locally, along with a description
of up to five matches.
■ Then the combine/reduce task combines and sorts the
results and consolidates the output.
c. Final results are stored on Amazon S3 in the output
bucket.
65
3. The monitoring phase.
■ The monitor controller thread retrieves the
message left at the beginning of the processing
phase, validates the status/error in Amazon
Simple DB, and executes the monitor task.
■ It updates the status in the Amazon Simple DB
domain and enqueues messages in the
shutdown and billing queues.
66
■ The monitor task checks for the Hadoop status
periodically and updates the Simple DB items
with status/error and the Amazon S3 output file.
■ Finally, it deletes the message from the monitor
queue when the processing is completed.
67
4. The shutdown phase.
■ The shutdown controller thread retrieves the
message from the shutdown queue and executes
the shutdown task, which updates the status and
time stamps in the Amazon Simple DB domain.
■ Finally, it deletes the message from the shutdown
queue after processing. The shutdown phase
consists of the following steps:
68
a. The shutdown task kills the Hadoop processes,
terminates the EC2 instances after getting EC2
topology information from Amazon Simple DB, and
disposes of the infrastructure.
b. The billing task gets the EC2 topology information,
Simple DB usage, and S3 file and query input,
calculates the charges, and passes the information to
the billing service.
69
5. The cleanup phase.
■ Archives the Simple DB data with user info.
6. User interactions with the system - get the
status and output results. The GetStatus request is applied
to the service endpoint to obtain the status of the overall
system (all controllers and Hadoop) and to download the
filtered results from Amazon S3 after completion.
70
Conclusion :
■ This application illustrates the means to create an on-
demand infrastructure and run it on a massively
distributed system in a manner that allows it to run in
parallel and scale up and down based on the number of
users and the problem size.
71
2.8 Clouds for science and engineering
■ The generic problems in virtually all areas of
science are:
❖Collection of experimental data.
❖Management of very large volumes of data.
❖Building and execution of models.
❖Integration of data and literature.
❖Documentation of the experiments, e.g., as
CSV files.
❖Sharing the data with others; data
preservation for long periods of time.
72
■ All these activities require “big” data storage and
systems capable of delivering abundant computing cycles.
■ Computing clouds are able to provide such resources
and support collaborative environments.
73
Online data discovery
■ Phases of data discovery in large scientific data
sets:
◻ recognition of the information problem.
◻ generation of search queries using one or more
search engines.
◻ evaluation of the search results.
◻ evaluation of the web documents.
◻ comparing information from different sources.
74
Large scientific data sets:
◻ biomedical and genomic data from the National
Center for Biotechnology Information (NCBI).
◻ astrophysics data from NASA.
◻ atmospheric data from the National Oceanic and
Atmospheric Administration (NOAA) and the National
Center for Atmospheric Research (NCAR).
75
2.9 High performance computing on a cloud
■ Comparative benchmark of EC2 and three
supercomputers at the National Energy Research
Scientific Computing Center (NERSC) at Lawrence
Berkeley National Laboratory.
■ NERSC has some 3,000 researchers and involves 400
projects based on some 600 codes.
76
■ Conclusion – communication-intensive applications are
affected by the increased latency and lower bandwidth of
the cloud.
■ The low latency and high bandwidth of the
interconnection network of a supercomputer cannot be
matched by a cloud.
77
The systems used for the comparison with cloud
computing are:
78
SLC: Legacy applications on the cloud
■ Is it feasible to run legacy applications on a cloud?
■ Cirrus - a general platform for executing legacy Windows
applications on the cloud. A Cirrus job consists of a prologue,
commands, and parameters. The prologue sets up the running
environment; the commands are sequences of shell scripts, including
Azure-storage-related commands to transfer data between Azure blob
storage and the instance.
■ BLAST - a biology code which finds regions of local similarity
between sequences; it compares nucleotide or protein sequences
to sequence databases and calculates the statistical significance of
matches; used to infer functional and evolutionary relationships
between sequences and identify members of gene families.
■ AzureBLAST - a version of BLAST running on the Azure platform.
79
Cirrus
80
Execution of loosely-coupled workloads using the
Azure platform
81
Social computing and digital content
■ Networks allowing researchers to share data and provide a virtual
environment supporting remote execution of workflows are domain
specific:
◻ MyExperiment for biology.
◻ nanoHub for nanoscience.
■ Volunteer computing - a large population of users donate resources
such as CPU cycles and storage space for a specific project:
◻ Mersenne Prime Search
◻ SETI@Home
◻ Folding@home
◻ Storage@Home
◻ PlanetLab
■ Berkeley Open Infrastructure for Network Computing (BOINC) 🡪
middleware for a distributed infrastructure suitable for different
applications.
82
M5.pptx
M5.pptxM5.pptx
M5.pptx
 

Recently uploaded

Independent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging StationIndependent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging Stationsiddharthteach18
 
Autodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxAutodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxMustafa Ahmed
 
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdflitvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdfAlexander Litvinenko
 
Final DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manualFinal DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manualBalamuruganV28
 
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...josephjonse
 
Software Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdfSoftware Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdfssuser5c9d4b1
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Toolssoginsider
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxMustafa Ahmed
 
Dynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptxDynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptxMustafa Ahmed
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfJNTUA
 
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas SachpazisSeismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas SachpazisDr.Costas Sachpazis
 
Diploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfDiploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfJNTUA
 
Filters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsFilters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsMathias Magdowski
 
Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..MaherOthman7
 
The Entity-Relationship Model(ER Diagram).pptx
The Entity-Relationship Model(ER Diagram).pptxThe Entity-Relationship Model(ER Diagram).pptx
The Entity-Relationship Model(ER Diagram).pptxMANASINANDKISHORDEOR
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...archanaece3
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...drjose256
 
electrical installation and maintenance.
electrical installation and maintenance.electrical installation and maintenance.
electrical installation and maintenance.benjamincojr
 
Adsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptAdsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptjigup7320
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Ramkumar k
 

Recently uploaded (20)

Independent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging StationIndependent Solar-Powered Electric Vehicle Charging Station
Independent Solar-Powered Electric Vehicle Charging Station
 
Autodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptxAutodesk Construction Cloud (Autodesk Build).pptx
Autodesk Construction Cloud (Autodesk Build).pptx
 
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdflitvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
 
Final DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manualFinal DBMS Manual (2).pdf final lab manual
Final DBMS Manual (2).pdf final lab manual
 
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...8th International Conference on Soft Computing, Mathematics and Control (SMC ...
8th International Conference on Soft Computing, Mathematics and Control (SMC ...
 
Software Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdfSoftware Engineering Practical File Front Pages.pdf
Software Engineering Practical File Front Pages.pdf
 
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and ToolsMaximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
Maximizing Incident Investigation Efficacy in Oil & Gas: Techniques and Tools
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptx
 
Dynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptxDynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptx
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
 
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas SachpazisSeismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
Seismic Hazard Assessment Software in Python by Prof. Dr. Costas Sachpazis
 
Diploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfDiploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdf
 
Filters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsFilters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility Applications
 
Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..
 
The Entity-Relationship Model(ER Diagram).pptx
The Entity-Relationship Model(ER Diagram).pptxThe Entity-Relationship Model(ER Diagram).pptx
The Entity-Relationship Model(ER Diagram).pptx
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
 
electrical installation and maintenance.
electrical installation and maintenance.electrical installation and maintenance.
electrical installation and maintenance.
 
Adsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) pptAdsorption (mass transfer operations 2) ppt
Adsorption (mass transfer operations 2) ppt
 
Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)Theory of Time 2024 (Universal Theory for Everything)
Theory of Time 2024 (Universal Theory for Everything)
 

Cloud computing_Applications and paradigams.pptx

2.1 Challenges for cloud application development
■ Performance isolation - nearly impossible to achieve in a real system, especially when the system is heavily loaded.
■ Reliability - a major concern; server failures must be expected when a large number of servers cooperate on a computation.
8
■ The cloud infrastructure exhibits latency and bandwidth fluctuations that affect application performance.
■ Performance considerations limit the amount of data logging, yet frequent logging helps identify the source of unexpected results and errors.
9
2.2 Architectural styles for cloud applications
■ Based on the client-server paradigm.
■ Stateless servers - view a client request as an independent transaction and respond to it; the client is not required to first establish a connection to the server.
■ Often clients and servers communicate using Remote Procedure Calls (RPCs).
10
■ Simple Object Access Protocol (SOAP) - an application protocol for web applications; its message format is based on XML. It uses TCP or UDP as the transport protocol.
■ Representational State Transfer (REST) - a software architecture style for distributed hypermedia systems. It supports client communication with stateless servers, is platform and language independent, supports data caching, and can be used in the presence of firewalls.
11
2.3 Workflows: Coordination of multiple activities
Workflow
■ Many cloud applications require the completion of multiple interdependent tasks; the description of a complex activity involving such tasks is known as a workflow.
Workflow models
■ Workflow models are abstractions revealing the most important properties of the entities participating in a workflow management system.
12
Task
■ Task is the central concept in workflow modeling; a task is a unit of work to be performed on the cloud.
Attributes of a task
1. Name: A string of characters uniquely identifying the task.
2. Description: A natural language description of the task.
3. Actions: Modifications of the environment caused by the execution of the task.
4. Preconditions: Boolean expressions that must be true before the action(s) of the task can take place.
13
5. Post-conditions: Boolean expressions that must be true after the action(s) of the task take place.
6. Attributes: Provide indications of the type and quantity of resources necessary for the execution of the task, the actors in charge of the task, the security requirements, whether the task is reversible, and other task characteristics.
14
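The task attributes above can be sketched as a small data structure. This is an illustrative example only: the class and field names mirror the slide's terminology, but they are not part of any workflow standard or library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Illustrative sketch of the task model described above; the attribute
# names follow the slide, the class itself is hypothetical.
@dataclass
class Task:
    name: str                        # unique identifier of the task
    description: str = ""            # natural-language description
    preconditions: List[Callable[[dict], bool]] = field(default_factory=list)
    postconditions: List[Callable[[dict], bool]] = field(default_factory=list)
    exceptions: Dict[str, Callable[[dict], None]] = field(default_factory=dict)

    def run(self, env: dict, action: Callable[[dict], None]) -> None:
        """Execute the action only if every precondition holds,
        then verify the postconditions."""
        if not all(p(env) for p in self.preconditions):
            raise RuntimeError(f"preconditions of {self.name} not satisfied")
        action(env)                  # the task's effect on the environment
        if not all(p(env) for p in self.postconditions):
            raise RuntimeError(f"postconditions of {self.name} violated")

# Example: a task that increments a counter, with simple pre/post checks.
env = {"counter": 0}
t = Task("increment",
         preconditions=[lambda e: e["counter"] >= 0],
         postconditions=[lambda e: e["counter"] > 0])
t.run(env, lambda e: e.__setitem__("counter", e["counter"] + 1))
print(env["counter"])  # 1
```

An exception handler from the `<event, action>` list would be looked up in `exceptions` when `run` raises; events with no entry would trigger replanning.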
7. Exceptions: Provide information on how to handle abnormal events.
❖ The exceptions supported by a task consist of a list of <event, action> pairs.
❖ The exceptions included in the task exception list are called anticipated exceptions, as opposed to unanticipated exceptions.
❖ Events not included in the exception list trigger replanning.
❖ Replanning means restructuring of a process or redefinition of the relationship among various tasks.
15
Different types of tasks
■ A composite task is a structure describing a subset of tasks and the order of their execution.
■ A primitive task is one that cannot be decomposed into simpler tasks.
❖ A composite task inherits some properties from workflows; it consists of tasks and has one start symbol and possibly several end symbols.
❖ At the same time, a composite task inherits some properties from tasks; it has a name, preconditions, and post-conditions.
16
■ A routing task is a special-purpose task connecting two tasks in a workflow description.
■ The task that has just completed execution is called the predecessor task; the one to be initiated next is called the successor task.
■ A routing task could trigger a sequential, concurrent, or iterative execution. Several types of routing tasks exist:
17
■ A fork routing task triggers execution of several successor tasks. Several semantics for this construct are possible:
• All successor tasks are enabled.
• Each successor task is associated with a condition. The conditions for all tasks are evaluated, and only the tasks with a true condition are enabled.
• Each successor task is associated with a condition. The conditions for all tasks are evaluated, but the conditions are mutually exclusive and only one condition may be true. Thus, only one task is enabled.
• Nondeterministic: k out of n > k successors are selected at random to be enabled.
18
■ A join routing task waits for completion of its predecessor tasks. There are several semantics for the join routing task:
• The successor is enabled after all predecessors end.
• The successor is enabled after k out of n > k predecessors end.
• Iterative: the tasks between the fork and the join are executed repeatedly.
19
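The simplest fork/join semantics above ("all successors enabled", "successor enabled after all predecessors end") can be sketched with a thread pool; the task bodies here are hypothetical placeholders.

```python
import concurrent.futures

# Sketch of a fork routing task enabling all successors, followed by a
# join that waits for every predecessor (hypothetical task bodies).
def successor(name):
    return f"{name} done"

with concurrent.futures.ThreadPoolExecutor() as pool:
    a = "A done"                                   # predecessor A terminates
    # Fork: both successor tasks B and C are enabled.
    futures = [pool.submit(successor, n) for n in ("B", "C")]
    # Join: task D is enabled only after all predecessors end.
    results = [f.result() for f in futures]

print([a] + sorted(results))  # ['A done', 'B done', 'C done']
```

The "k out of n" join variant would instead consume only the first k futures returned by `concurrent.futures.as_completed`.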
■ Process description - a structure describing the tasks to be executed and the order of their execution. It resembles a flowchart.
■ Case - an instance of a process description.
20
■ State of a case at time t - defined in terms of the tasks already completed at that time.
■ Events - cause transitions between states.
■ The life cycle of a workflow - creation, definition, verification, and enactment; similar to the life cycle of a traditional program (creation, compilation, and execution).
21
Safety and liveness
■ Desirable properties of workflows.
■ Safety 🡪 nothing "bad" ever happens.
■ Liveness 🡪 something "good" will eventually happen.
23
(a) A process description that violates the liveness requirement. If task C is chosen after completion of B, the process will terminate after executing task G; if D is chosen, then F will never be instantiated, because it requires the completion of both C and E. The process will never terminate, because G requires completion of both D and F.
(b) Tasks A and B need exclusive access to two resources r and q, and a deadlock may take place if the following sequence of events occurs: at time t1 task A acquires r; at time t2 task B acquires q and continues to run; at time t3 task B attempts to acquire r and blocks because r is under the control of A; task A continues to run and at time t4 attempts to acquire q and blocks because q is under the control of B.
26
Deadlock avoidance solution
■ The deadlock illustrated in Figure 1 (b) can be avoided by requiring each task to acquire all its resources at the same time.
■ The price to pay is underutilization of resources; indeed, the idle time of each resource increases under this scheme.
27
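The all-at-once acquisition rule can be sketched with locks: a single allocation lock guards the step in which a task takes both r and q, so the circular wait from the figure cannot arise. The task bodies are hypothetical.

```python
import threading

# Sketch of all-at-once resource acquisition: tasks A and B may only
# take resources r and q together, under one allocation lock.
allocation = threading.Lock()
r, q = threading.Lock(), threading.Lock()
order = []

def task(name):
    with allocation:          # acquire the whole resource set atomically
        r.acquire()
        q.acquire()
    order.append(name)        # critical section using both resources
    r.release()
    q.release()

threads = [threading.Thread(target=task, args=(n,)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(order))  # ['A', 'B'] - both tasks complete, no deadlock
```

Without the `allocation` lock, interleaving `r.acquire()` in A with `q.acquire()` in B reproduces exactly the t1..t4 sequence in the caption.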
Basic workflow patterns
■ Workflow patterns - the temporal relationships among the tasks of a process.
■ These patterns are classified in several categories: basic, advanced branching and synchronization, structural, state-based, cancellation, and patterns involving multiple instances.
28
■ Sequence - several tasks have to be scheduled one after the completion of the other.
■ AND split - both tasks B and C are activated when task A terminates.
30
■ Synchronization - task C can only start after tasks A and B terminate.
■ XOR split - after completion of task A, either B or C can be activated.
31
■ XOR merge - task C is enabled when either A or B terminates.
■ OR split - after completion of task A one could activate either B, C, or both.
32
■ Multiple merge - once task A terminates, B and C execute concurrently; when the first of them, say B, terminates, D is activated; then, when C terminates, D is activated again.
■ Discriminator - waits for a number of incoming branches to complete before activating the subsequent activity; then waits for the remaining branches to finish without taking any action until all of them have terminated. Next, it resets itself.
33
■ N out of M join - barrier synchronization. Assuming that M tasks run concurrently, N (N < M) of them have to reach the barrier before the next task is enabled. In our example, any two out of the three tasks A, B, and C have to finish before E is enabled.
■ Deferred choice - similar to the XOR split, but the choice is not made explicitly; the run-time environment decides what branch to take.
34
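The N out of M join from the example (E enabled once any two of A, B, C finish) can be sketched with a condition variable counting completions; the delays and task names are hypothetical.

```python
import threading
import time

# Sketch of the "N out of M" join: task E is enabled as soon as any
# N = 2 of the M = 3 concurrent tasks A, B, C finish.
finished = 0
cond = threading.Condition()
log = []

def worker(name, delay):
    global finished
    time.sleep(delay)               # simulate work of unequal duration
    with cond:
        finished += 1
        log.append(name)
        cond.notify_all()

threads = [threading.Thread(target=worker, args=a)
           for a in [("A", 0.01), ("B", 0.02), ("C", 0.5)]]
for t in threads:
    t.start()

with cond:
    cond.wait_for(lambda: finished >= 2)   # the join's enabling condition
    log.append("E")                        # task E is enabled

for t in threads:
    t.join()
print(log.index("E") >= 2)  # True: E ran only after two predecessors
```

A full barrier (N = M) would instead use `threading.Barrier(M + 1)`, with every task and the successor calling `wait()`.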
2.4 Coordination based on a state machine model: ZooKeeper
■ Cloud elasticity 🡪 distribute computations and data across multiple systems; coordination among these systems is a critical function in a distributed environment.
35
ZooKeeper
▪ A distributed coordination service for large-scale distributed systems.
▪ A high-throughput and low-latency service.
▪ Open-source software written in Java with bindings for Java and C.
39
▪ The servers in the pack communicate and elect a leader.
▪ A database is replicated on each server; consistency of the replicas is maintained.
▪ A client connects to a single server using TCP, synchronizes its clock with the server, and sends requests, receives responses, and watches events through this TCP connection.
40
FIGURE: The ZooKeeper coordination service. (a) The service provides a single system image; clients can connect to any server in the pack. (b) Functional model of the ZooKeeper service: the replicated database is accessed directly by read commands; write commands involve more intricate processing based on atomic broadcast. (c) Processing a write command: (1) a server receiving the command from a client forwards it to the leader; (2) the leader uses atomic broadcast to reach consensus among all followers.
42
■ Shared hierarchical namespace similar to a file system; znodes instead of inodes.
43
ZooKeeper service guarantees
■ Atomicity - a transaction either completes or fails.
■ Sequential consistency of updates - updates are applied strictly in the order they are received.
■ Single system image for the clients - a client receives the same response regardless of the server it connects to.
■ Persistence of updates - once applied, an update persists until it is overwritten by a client.
■ Reliability - the system is guaranteed to function correctly as long as the majority of servers function correctly.
44
ZooKeeper communication
■ Messaging layer 🡪 responsible for the election of a new leader when the current leader fails.
■ Messaging protocols use:
■ Packets - sequences of bytes sent through a FIFO channel.
■ Proposals - units of agreement.
■ Messages - sequences of bytes atomically broadcast to all servers.
45
ZooKeeper communication (cont'd)
■ A message is included in a proposal, and the proposal is agreed upon before the message is delivered.
■ Proposals are agreed upon by exchanging packets with a quorum of servers, as required by the Paxos algorithm.
46
ZooKeeper communication (cont'd)
■ The messaging layer guarantees:
❖ Reliable delivery: if a message m is delivered to one server, it will eventually be delivered to all servers.
❖ Total order: if message m is delivered before message n to one server, it will be delivered before n to all servers.
❖ Causal order: if message n is sent after m has been delivered by the sender of n, then m must be ordered before n.
47
ZooKeeper API
■ The API is simple - it consists of seven operations:
❖ Create - add a node at a given location in the tree.
❖ Delete - delete a node.
❖ Exists - test whether a node exists at a given location.
❖ Get data - read data from a node.
❖ Set data - write data to a node.
❖ Get children - retrieve a list of the children of a node.
❖ Synch - wait for the data to propagate.
48
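The hierarchical namespace and the API operations can be sketched as a tiny in-memory znode tree. This is a single-process toy, not the real ZooKeeper bindings: the actual service replicates the tree across servers and adds watches, versions, and ephemeral nodes; the method names here merely mirror the operation list above.

```python
# Toy sketch of a ZooKeeper-like znode tree held in a dict keyed by path.
class ZNodeTree:
    def __init__(self):
        self.nodes = {"/": b""}                  # root znode

    def create(self, path, data=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        self.nodes[path] = data

    def delete(self, path):
        del self.nodes[path]

    def exists(self, path):
        return path in self.nodes

    def get_data(self, path):
        return self.nodes[path]

    def set_data(self, path, data):
        if path not in self.nodes:
            raise KeyError(path)
        self.nodes[path] = data

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(p.rsplit("/", 1)[1] for p in self.nodes
                      if p != "/" and p.startswith(prefix)
                      and "/" not in p[len(prefix):])

tree = ZNodeTree()
tree.create("/app")
tree.create("/app/config", b"v1")
tree.set_data("/app/config", b"v2")
print(tree.get_children("/app"), tree.get_data("/app/config"))
```

`synch` has no local analogue here, since propagation delay only exists once the tree is replicated.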
2.6 The MapReduce programming model
Advantages of cloud computing: elasticity and load distribution
■ Elasticity 🡪 the ability to use as many servers as necessary to optimally respond to the cost and timing constraints of an application.
■ Load distribution 🡪 a front-end system distributes the incoming requests to a number of back-end systems and attempts to balance the load among them. As the workload increases, new back-end systems are added to the pool.
49
Load distribution 🡪
■ How to divide the load:
◻ Transaction processing systems 🡪 a front-end distributes the incoming transactions to a number of back-end systems. As the workload increases, new back-end systems are added to the pool.
◻ For data-intensive batch applications, two types of divisible workloads are possible:
■ modularly divisible 🡪 the workload partitioning is defined a priori.
■ arbitrarily divisible 🡪 the workload can be partitioned into an arbitrarily large number of smaller workloads of equal, or very close, size.
50
■ Many applications in physics, biology, and other areas of computational science and engineering obey the arbitrarily divisible load sharing model.
51
MapReduce philosophy
1. An application starts a master instance, M worker instances for the Map phase, and later R worker instances for the Reduce phase.
2. The master instance partitions the input data in M segments.
3. Each map instance reads its input data segment and processes the data.
4. The results of the processing are stored on the local disks of the servers where the map instances run.
52
5. When all map instances have finished processing their data, the R reduce instances read the results of the first phase and merge the partial results.
6. The final results are written by the reduce instances to a shared storage server.
7. The master instance monitors the reduce instances, and when all of them report task completion the application is terminated.
53
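The phases above can be sketched as a minimal single-process word count: M map "instances" process their input segments, intermediate pairs stand in for the files on local disks, and the reduce step merges the partial results. The segments and functions are illustrative, not the Hadoop API.

```python
from collections import defaultdict

# Minimal sketch of the MapReduce phases, with word count as the example.
def map_phase(segment):
    # each map instance emits (key, value) pairs for its input segment
    return [(word, 1) for word in segment.split()]

def reduce_phase(key, values):
    # each reduce instance merges the partial results for one key
    return key, sum(values)

segments = ["the cloud the map", "map reduce the cloud"]   # M = 2 segments
intermediate = defaultdict(list)
for seg in segments:                                       # map phase
    for k, v in map_phase(seg):
        intermediate[k].append(v)                          # "local disk"
results = dict(reduce_phase(k, vs) for k, vs in intermediate.items())
print(results["the"], results["map"])  # 3 2
```

In the real model the map instances run on separate servers and the grouping of intermediate pairs by key is done by a distributed shuffle, not an in-memory dict.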
2.7 A case study: The GrepTheWeb application
■ The application illustrates the means to
◻ create an on-demand infrastructure.
◻ run it on a massively distributed system in a manner that allows it to run in parallel and scale up and down, based on the number of users and the problem size.
55
■ GrepTheWeb
◻ Performs a search of a very large set of records to identify records that satisfy a regular expression.
◻ It is analogous to the Unix grep command.
◻ The source is a collection of document URLs produced by the Alexa Web Search, a software system that crawls the web every night.
◻ Uses message passing to trigger the activities of multiple controller threads, which launch the application, initiate processing, shut down the system, and create billing records.
56
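The core operation GrepTheWeb distributes across map tasks can be sketched in a few lines: scan a set of records for matches of a user-supplied regular expression, as Unix grep does. The records here are hypothetical URLs, not Alexa data.

```python
import re

# Sketch of GrepTheWeb's core operation: keep the records matching a
# user-supplied regular expression (hypothetical input records).
def grep_records(pattern, records):
    rx = re.compile(pattern)
    return [r for r in records if rx.search(r)]

records = ["http://example.com/a.html",
           "http://example.org/data.csv",
           "http://example.com/b.csv"]
matches = grep_records(r"\.csv$", records)
print(matches)  # the two .csv URLs
```

In the actual application each map task applies such a scan to one chunk of the URL collection stored on S3, and the reduce step consolidates the per-chunk match lists.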
(a) The simplified workflow showing the inputs:
- the regular expression.
- the input records generated by the web crawler.
- the user commands to report the current status and to terminate the processing.
(b) The detailed workflow. The system is based on message passing between several queues; four controller threads periodically poll their associated input queues, retrieve messages, and carry out the required actions.
57
FIGURE
■ The organization of the GrepTheWeb application.
■ The application uses the Hadoop MapReduce software and four Amazon services: EC2, Simple DB, S3, and SQS.
58
■ The simplified workflow shows two inputs: the regular expression and the input records generated by the web crawler.
■ A third type of input is the user commands to report the current status and to terminate the processing.
59
(b) The detailed workflow
60
■ The system is based on message passing between several queues; four controller threads periodically poll their associated input queues, retrieve messages, and carry out the required actions.
61
Steps of the workflow
1. The startup phase.
■ Creates several queues - launch, monitor, billing, and shutdown queues.
■ Starts the corresponding controller threads. Each thread periodically polls its input queue and, when a message is available, retrieves the message, parses it, and takes the required actions.
62
2. The processing phase.
■ This phase is triggered by a StartGrep user request; a launch message is then enqueued in the launch queue.
■ The launch controller thread picks up the message and executes the launch task; then it updates the status and time stamps in the Amazon Simple DB domain.
■ Finally, it enqueues a message in the monitor queue and deletes the message from the launch queue.
63
■ The processing phase consists of the following steps:
a. The launch task starts Amazon EC2 instances. It uses an Amazon Machine Image (AMI) with the Java Runtime Environment preinstalled, deploys the required Hadoop libraries, and starts a Hadoop job (to run the Map/Reduce tasks).
64
b. Hadoop runs map tasks on Amazon EC2 slave nodes in parallel. A map task takes files from Amazon S3, runs a regular expression, and writes the match results locally, along with a description of up to five matches. Then the combine/reduce task combines and sorts the results and consolidates the output.
c. Final results are stored on Amazon S3 in the output bucket.
65
3. The monitoring phase.
■ The monitor controller thread retrieves the message left at the beginning of the processing phase, validates the status/error in Amazon Simple DB, and executes the monitor task.
■ It updates the status in the Amazon Simple DB domain and enqueues messages in the shutdown and billing queues.
66
■ The monitor task checks the Hadoop status periodically and updates the Simple DB items with the status/error and the Amazon S3 output file.
■ Finally, it deletes the message from the monitor queue when the processing is completed.
67
4. The shutdown phase.
■ The shutdown controller thread retrieves the message from the shutdown queue and executes the shutdown task, which updates the status and time stamps in the Amazon Simple DB domain.
■ Finally, it deletes the message from the shutdown queue after processing. The shutdown phase consists of the following steps:
68
a. The shutdown task kills the Hadoop processes, terminates the EC2 instances after getting the EC2 topology information from Amazon Simple DB, and disposes of the infrastructure.
b. The billing task gets the EC2 topology information, Simple DB usage, and S3 file and query input, calculates the charges, and passes the information to the billing service.
69
5. The cleanup phase.
■ Archives the Simple DB data with the user info.
6. User interactions with the system.
■ Get the status and output results. GetStatus is applied to the service endpoint to get the status of the overall system (all controllers and Hadoop) and to download the filtered results from Amazon S3 after completion.
70
Conclusion:
■ This application illustrates the means to create an on-demand infrastructure and run it on a massively distributed system in a manner that allows it to run in parallel and scale up and down based on the number of users and the problem size.
71
2.8 Clouds for science and engineering ■ The generic problems in virtually all areas of science are: ❖Collection of experimental data. ❖Management of very large volumes of data. ❖Building and execution of models. ❖Integration of data and literature. ❖Documentation of the experiments. Ex.: a CSV file. ❖Sharing the data with others; data preservation for long periods of time. 72
■ All these activities require “big” data storage and systems capable of delivering abundant computing cycles. ■ Computing clouds are able to provide such resources and to support collaborative environments. 73
Online data discovery ■ Phases of data discovery in large scientific data sets: ◻ recognition of the information problem. ◻ generation of search queries using one or more search engines. ◻ evaluation of the search results. ◻ evaluation of the web documents. ◻ comparison of information from different sources. 74
Large scientific data sets: ◻ biomedical and genomic data from the National Center for Biotechnology Information (NCBI). ◻ astrophysics data from NASA. ◻ atmospheric data from the National Oceanic and Atmospheric Administration (NOAA) and the National Center for Atmospheric Research (NCAR). 75
2.9 High performance computing on a cloud ■ A comparative benchmark of EC2 and three supercomputers at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. ■ NERSC serves some 3,000 researchers and about 400 projects based on some 600 codes. 76
■ Conclusion – communication-intensive applications are affected by the increased latency and lower bandwidth of the cloud. ■ The low latency and high bandwidth of the interconnection network of a supercomputer cannot be matched by a cloud. 77
The systems used for the comparison with cloud computing are: 78
SLC: Legacy applications on the cloud ■ Is it feasible to run legacy applications on a cloud? ■ Cirrus - a general platform for executing legacy Windows applications on the cloud. A Cirrus job consists of a prologue, commands, and parameters. The prologue sets up the running environment; the commands are sequences of shell scripts, including Azure-storage-related commands to transfer data between Azure blob storage and the instance. ■ BLAST - a biology code which finds regions of local similarity between sequences; it compares nucleotide or protein sequences against sequence databases and calculates the statistical significance of the matches; it is used to infer functional and evolutionary relationships between sequences and to identify members of gene families. ■ AzureBLAST - a version of BLAST running on the Azure platform. 79
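A Cirrus job for AzureBLAST might be described as below. The field names, blob paths, and helper commands (`download_blob`, `upload_blob`) are assumptions used for illustration; only the prologue/commands/parameters structure comes from the text.

```python
def make_cirrus_job(query_blob, db_blob, out_blob):
    """Sketch of a Cirrus job description: prologue, commands, parameters.
    All names and paths here are hypothetical."""
    return {
        "prologue": [                       # sets up the running environment
            "download_blob " + db_blob,     # fetch the sequence database
            "download_blob " + query_blob,  # fetch the query sequences
        ],
        "commands": [                       # shell scripts, incl. Azure-storage commands
            "blastall -p blastp -d db.fasta -i query.fasta -o result.txt",
            "upload_blob result.txt " + out_blob,
        ],
        "parameters": {"instances": 8},     # e.g. how many worker instances to use
    }

job = make_cirrus_job("queries/q1.fasta", "db/nr.fasta", "results/q1.out")
```

The point of the structure is that the legacy Windows binary (here the BLAST executable) runs unmodified; the prologue and the storage commands around it adapt it to the Azure instance.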
Execution of loosely-coupled workloads using the Azure platform 81
Social computing and digital content ■ Networks allowing researchers to share data and providing a virtual environment that supports remote execution of workflows are domain specific: ◻ MyExperiment for biology. ◻ nanoHub for nanoscience. ■ Volunteer computing - a large population of users donate resources such as CPU cycles and storage space to a specific project: ◻ Mersenne Prime Search ◻ SETI@Home ◻ Folding@home ◻ Storage@Home ◻ PlanetLab ■ Berkeley Open Infrastructure for Network Computing (BOINC) 🡪 middleware for a distributed infrastructure suitable for different applications. 82