Opportunities and Challenges for
Running Scientific Workflows
on the Cloud
Yong Zhao, Xubo Fei, Ioan Raicu, Shiyong Lu

Cyber-Enabled Distributed Computing and Knowledge Discovery
(CyberC), 2011 International Conference

Ying Lian
Computer Science, WSU
Overview
 INTRODUCTION
 OPPORTUNITIES
 CHALLENGES
 RESEARCH DIRECTIONS
 CONCLUSIONS
INTRODUCTION
 There is something in the air.
INTRODUCTION
 Cloud computing is gaining tremendous momentum in both
academia and industry.
 “Cloud Computing”: a large-scale distributed computing
paradigm that is driven by economies of scale, in which a
pool of abstracted, virtualized, dynamically-scalable,
managed computing power, storage, platforms, and
services are delivered on demand to external customers
over the Internet.
 So far mostly applied to Web applications and business
applications. Support for workflow applications is the
missing link.
INTRODUCTION
 Manage and run workflow applications on the cloud
(especially data-intensive scientific workflows)
 Several scientific workflow management systems
(SWFMSs) have already been applied.
 Cloud Workflow: specification, execution, and
provenance tracking of scientific workflows, as well as
the management of data and computing resources to
enable the running of scientific workflows on the Cloud

 Following sections: Meaning, challenges, research
opportunities
OPPORTUNITIES
 Keywords: Infinite computing resource
 1. The scale of scientific problems that scientific workflows
can address is now greatly increased. It was previously
bounded by the size of a dedicated resource pool, with
limited resource sharing in the form of virtual
organizations.
 data size (e.g. GenBank doubles every 9-12 months): needs
vast storage space
 complexity of the applications (e.g. protein simulation by
iterative algorithms with huge parameter spaces): needs
massive computing resources
OPPORTUNITIES
 2. The on-demand resource allocation
mechanism in Cloud has a number of
advantages over the traditional cluster/Grid
environments for scientific workflows:
 Improved resource utilization: different stages of a
workflow require different amounts of resources.
 Faster turn-around time for end users: dynamic scale
out/in
 Enables a new generation of workflows: collaborative
scientific workflows, in which user interaction and
collaboration patterns are favored
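The utilization argument can be made concrete with a small sketch. The stage list below is hypothetical (names, node counts, and runtimes are made up for illustration), and provisioning overhead is ignored. It compares the node-hours a fixed pool sized for the widest stage would hold against the node-hours an on-demand allocation would use.

```python
# Hypothetical workflow: (stage name, nodes needed, hours of runtime).
STAGES = [("preprocess", 4, 1.0), ("simulate", 64, 2.0), ("aggregate", 2, 0.5)]

def static_node_hours(stages):
    """A fixed pool is sized for the widest stage and held for the whole run."""
    peak = max(nodes for _, nodes, _ in stages)
    total_hours = sum(hours for _, _, hours in stages)
    return peak * total_hours

def on_demand_node_hours(stages):
    """On-demand allocation holds only what each stage actually uses."""
    return sum(nodes * hours for _, nodes, hours in stages)

def utilization(stages):
    """Fraction of the fixed pool's node-hours that do useful work."""
    return on_demand_node_hours(stages) / static_node_hours(stages)

if __name__ == "__main__":
    print(static_node_hours(STAGES), on_demand_node_hours(STAGES), utilization(STAGES))
```

With these made-up numbers the fixed pool holds 224 node-hours while the stages use only 133, so roughly 40% of the fixed pool sits idle.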
OPPORTUNITIES
 3. Much bigger room for trade-off between
performance and cost.
 Spectrum of resource investment: from dedicated
private resources, through hybrid local & cloud, to full
outsourcing on clouds
 Cloud computing brings the opportunity to improve
the performance/cost ratio
 But optimizing this ratio and building an automatic trade-off mechanism remain challenging.
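A toy version of the performance/cost trade-off, under strong assumptions that are not from the slides: the work is perfectly parallel, every node is identical, and each node is billed per started hour. The function name `plan` and all numbers are hypothetical. It searches for the cheapest node count that still meets a deadline.

```python
import math

def plan(work_node_hours, deadline_hours, price_per_node_hour, max_nodes):
    """Return (nodes, runtime_hours, cost) for the cheapest plan meeting the
    deadline, or None if no node count up to max_nodes is fast enough."""
    best = None
    for n in range(1, max_nodes + 1):
        runtime = work_node_hours / n          # assumes perfect speed-up
        if runtime > deadline_hours:
            continue
        # Assumed billing model: every node pays for each started hour.
        cost = n * math.ceil(runtime) * price_per_node_hour
        if best is None or cost < best[2]:
            best = (n, runtime, cost)
    return best

if __name__ == "__main__":
    print(plan(100, 10, 0.5, 64))  # relaxed deadline
    print(plan(100, 3, 0.5, 64))   # tight deadline
```

Under this billing model a tighter deadline does not always cost more: 50 nodes for 2 hours cost the same as 10 nodes for 10 hours. Real pricing, imperfect speed-up, and data transfer charges break that symmetry, which is part of why the automatic trade-off is hard.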
CHALLENGES
 Architectural challenges
 Integration challenges
 Computing challenges
 Data management challenges

 Language challenges
 Service management challenges
Architectural Challenges
User interface customizability and support

Reproducibility support
Heterogeneous and distributed services and
software tools integration
Heterogeneous and distributed data product
management
High-end computing support
Workflow monitoring and failure handling
Interoperability
Reference Architecture for SWFMSs
Deploy the architecture: solutions
Operation layer
 SWFMS runs outside the Cloud
 No concern of vendor lock-in
 SWFMS itself cannot benefit from the scalability
Task management layer
 Not on a batch-based schedule
 Deploy immediately without sequence
 Cost of storage of provenance & data products
Workflow management layer
 Presentation layer deployed at a client machine
 Suitable for ad hoc domain-specific requirements
 More dependent on the Cloud platform
All in the Cloud
 SWFMS inside the Cloud, accessed via Web browser
 Highly scalable: Software as a Service
 Cost; dependency; vendor lock-in
Integration Challenges
How to integrate scientific workflow systems with Cloud
infrastructure and resources?

 Operation layer: applications, services, and tools are hosted
in the Cloud, while the scheduling and management of a
workflow stay outside the Cloud (e.g. Google Maps services,
with ad hoc scripts and programs used to glue the services
together)
 Task management layer: resource provisioning (e.g. Nimbus)
 Workflow management layer: debugging, monitoring, and
provenance tracking
 All in the Cloud: porting issues. Needs a workflow engine at
the Cloud end, and a Web interface or thin client at the user end
Language Challenges
 MapReduce: a widely used computing model, with two
key functions, Map and Reduce. --White-Box

 SwiftScript serves as a general purpose coordination
language, where existing applications can be invoked
without modification. --Black-Box
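The white-box/black-box contrast can be sketched in plain Python. This is neither Hadoop nor SwiftScript; the helper names (`map_reduce`, `word_map`, `count_reduce`, `run_black_box`) are hypothetical. In the white-box case the user writes the Map and Reduce functions that the framework calls. In the black-box case an existing program is invoked unmodified and only its inputs and outputs are visible to the coordinator.

```python
import subprocess
import sys
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    """White box: the framework runs user-written map and reduce functions."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)          # shuffle: group values by key
    return {key: reduce_fn(key, values) for key, values in groups.items()}

def word_map(line):
    return [(word, 1) for word in line.split()]

def count_reduce(_key, values):
    return sum(values)

def run_black_box(argv, stdin_text):
    """Black box: run an unmodified program and capture what it prints."""
    result = subprocess.run(argv, input=stdin_text, capture_output=True,
                            text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(map_reduce(["a b a", "b c"], word_map, count_reduce))
    # Stand-in "existing application": a one-line word counter.
    counter = [sys.executable, "-c",
               "import sys; print(len(sys.stdin.read().split()))"]
    print(run_black_box(counter, "a b a"))
```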
Language Challenges
 Handle the mapping from input and output data into
logical structures.
 Support large-scale parallelism via either implicit
parallelism, or explicit declaratives.
 Support data partitioning and task partitioning.
 Require a scalable, reliable, and efficient runtime system
that supports Cloud-scale task scheduling and
dispatching, and provides error recovery and fault tolerance.
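Data partitioning plus task dispatching can be sketched as follows. The function names are hypothetical, and a local thread pool stands in for a Cloud-scale dispatcher (in CPython, threads illustrate the dispatch pattern rather than real CPU parallelism). The input is split into near-equal chunks, one task runs per chunk, and the partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def parallel_sum_of_squares(items, n_parts=4):
    """Run one task per chunk, then combine the partial results."""
    def task(chunk):
        return sum(x * x for x in chunk)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(task, partition(items, n_parts)))
    return sum(partials)

if __name__ == "__main__":
    print(partition(list(range(10)), 3))
    print(parallel_sum_of_squares(list(range(10))))
```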
Computing Challenges
 A workflow system may not be able to talk to Cloud
resources directly, so middleware services are needed
(Nimbus or Falkon to handle the resource provisioning
and task dispatching)

 More complicated when considering: workflow resource
requirements, data dependencies, Cloud virtualization.
 A SWFMS will try to recover automatically when non-fatal
errors happen. Smart-return: detailed execution info is
logged so that the workflow can be restarted.
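A minimal sketch of the recovery idea, assuming a JSON file as the execution log and a hypothetical `TransientError` class to mark non-fatal failures; no particular SWFMS works exactly this way. Non-fatal errors are retried, each completed task is logged, and a restarted run skips tasks already recorded as done.

```python
import json
import os

class TransientError(Exception):
    """Hypothetical marker for non-fatal, retryable failures."""

def run_workflow(tasks, log_path, max_retries=2):
    """Run (name, fn) tasks in order, skipping those already logged as done."""
    done = {}
    if os.path.exists(log_path):
        with open(log_path) as fh:
            done = json.load(fh)               # results from an earlier run
    for name, fn in tasks:
        if name in done:
            continue                           # restart: do not redo finished work
        for attempt in range(max_retries + 1):
            try:
                done[name] = fn()
                break
            except TransientError:
                if attempt == max_retries:
                    raise                      # retries exhausted
        with open(log_path, "w") as fh:
            json.dump(done, fh)                # record progress after each task
    return done
```

Any other exception is treated as fatal and stops the run. The log written so far is what lets the next run resume from the failed task.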
Data management challenges
 As data intensiveness increases, the management of
data resources and of the dataflow between storage and
compute resources becomes the bottleneck.
 Data locality: CPU cycles get cheaper while data sets grow, so
data location, rather than computational resources, becomes
the main challenge
 Combining compute and data management: need to
minimize the amount of data movement. Otherwise,
raw resources will be significantly underutilized.
 Provenance: derivation history of a data product. Tracking
across service providers, and across different abstraction
layers. Secure access to provenance is also still missing.
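Minimizing data movement can be illustrated with a greedy placement rule. This is a sketch only: the file names, sizes, and node names in the usage example are made up, and a real scheduler would also weigh node load and network topology. For each task, the node that already holds the most input bytes is picked.

```python
def bytes_to_move(task_inputs, node, placement, sizes):
    """Bytes that must be copied to `node` before the task can start there."""
    return sum(sizes[f] for f in task_inputs if placement[f] != node)

def pick_node(task_inputs, nodes, placement, sizes):
    """Choose the node that minimizes data movement for this task."""
    return min(nodes, key=lambda n: bytes_to_move(task_inputs, n, placement, sizes))

if __name__ == "__main__":
    sizes = {"genome.fa": 900, "reads.fq": 100, "index": 50}           # hypothetical
    placement = {"genome.fa": "n1", "reads.fq": "n2", "index": "n2"}   # file -> node
    print(pick_node(["genome.fa", "reads.fq", "index"], ["n1", "n2", "n3"],
                    placement, sizes))
```

Here the task goes to `n1`: moving the two small files there costs 150 bytes, against 900 for `n2` and 1050 for `n3`.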
Service management challenges
 The engineering of the components of an SWFMS as
services:
 thousands of services developed and available for the
myExperiment project
 the LEAD system has developed a tool to wrap and convert
ordinary science applications into services

 The orchestration and invocation of services from an
SWFMS
 managing the large number of service instances
 data movements across different service instances
RESEARCH DIRECTIONS
 Emphasis on the workflow reference architecture, directing
research effort to the layers discussed above
 Great leap in middleware development: resource
management, monitoring, messaging
 Many-Task Computing (MTC): so far applied mainly in Grids
and supercomputers, expected to be largely improved for
the Cloud
 Scripting: mixture of semantics, combination of
applications and services…
 Cost optimization: very challenging, but rewarding too
RESEARCH DIRECTIONS
 SWFMS security
 Access control: critical because of the nature of
clouds (dynamic, large-scale data and service sharing)
 Information flow control: ensure that workflow-related
information is propagated only to authorized ends
 Secure electronic transaction protocol: needed for the
pay-as-you-go pricing model
CONCLUSIONS
 As more customers and applications migrate into the Cloud,
the need for workflow systems to manage
complex tasks will become more urgent
 For now, mashups and MapReduce-style task management
have been acting in place of a workflow system in the
Cloud
 The opportunities and challenges in bringing workflow
systems into the Cloud are discussed

 Key research directions for realizing scientific
workflows in Cloud environments are identified
Thank You!

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 


Opportunities and Challenges for Running Scientific Workflows on the Cloud

  • 1. Opportunities and Challenges for Running Scientific Workflows on the Cloud Yong Zhao, Xubo Fei, Ioan Raicu, Shiyong Lu Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2011 International Conference Ying Lian Computer Science, WSU
  • 2. Overview  INTRODUCTION  OPPORTUNITIES  CHALLENGES  RESEARCH DIRECTIONS  CONCLUSIONS
  • 3. INTRODUCTION  There is something in the air.
  • 4. INTRODUCTION  Cloud computing is gaining tremendous momentum in both academia and industry.  “Cloud Computing”: a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet.  So far it has mostly been applied to Web applications and business applications; the link needed to support workflow applications is still missing.
  • 5. INTRODUCTION  Manage and run workflow applications on the cloud (especially data-intensive scientific workflows)  Several Scientific workflow management systems (SWFMSs) have been applied.  Cloud Workflow: specification, execution, and provenance tracking of scientific workflows, as well as the management of data and computing resources to enable the running of scientific workflows on the Cloud  Following sections: Meaning, challenges, research opportunities
  • 6. OPPORTUNITIES  Keywords: Infinite computing resources  1. The scale of scientific problems that scientific workflows can address is now greatly increased. It was previously bounded above by the size of a dedicated resource pool, with limited extension through resource sharing in the form of virtual organizations.  Data size (e.g. GenBank doubles every 9-12 months): requires vast storage space  Application complexity (e.g. protein simulation by iterative algorithms with huge parameter sets): requires massive computing resources
  • 7. OPPORTUNITIES  2. The on-demand resource allocation mechanism in the Cloud has a number of advantages over traditional cluster/Grid environments for scientific workflows:  Improved resource utilization: different stages of a workflow require different amounts of resources.  Faster turn-around time for end users through dynamic scale-out and scale-in.  A new generation of workflows: collaborative scientific workflows, in which user interaction and collaboration patterns are favored.
  • 8. OPPORTUNITIES  3. Much more room for trade-offs between performance and cost.  Spectrum of resource investment: from dedicated private resources, through hybrid local and Cloud resources, to full outsourcing on Clouds.  Cloud computing brings the opportunity to improve the performance/cost ratio.  But optimizing this ratio and building an automatic trade-off mechanism remain challenging.
  • 9. CHALLENGES  Architectural challenges  Integration challenges  Computing challenges  Data management challenges  Language challenges  Service management challenges
  • 10. Architectural Challenges  User interface customizability and support  Reproducibility support  Heterogeneous and distributed services and software tools integration  Heterogeneous and distributed data product management  High-end computing support  Workflow monitoring and failure handling  Interoperability
  • 12. Deploy the architecture: solutions  Operation: SWFMS running outside the Cloud; no concern of vendor lock-in; SWFMS itself cannot benefit from the scalability  Task management: not on a batch-based schedule; deploy immediately without sequence; cost of storage of provenance and data products  Workflow management: presentation layer deployed at a client machine; suitable for ad hoc domain-specific requirements; more dependent on the Cloud platform  All-in-the-Cloud: SWFMS inside the Cloud, accessed via Web browser; highly scalable (Software as a Service); cost, dependency, vendor lock-in
  • 13. Integration Challenges  How can scientific workflow systems be integrated with Cloud infrastructure and resources?  Operation layer: applications, services, and tools are hosted in the Cloud, while the scheduling and management of a workflow stay outside the Cloud (e.g. Google Map services, with ad hoc scripts and programs used to glue the services together).  Task management layer: resource provisioning (e.g. Nimbus).  Workflow management layer: debugging, monitoring, and provenance tracking.  All in the Cloud: porting issues; this needs a workflow engine at the Cloud end and a Web interface or thin client at the user end.
  • 14. Language Challenges  MapReduce: a widely used computing model with two key functions, Map and Reduce. --White-Box  SwiftScript: a general-purpose coordination language in which existing applications can be invoked without modification. --Black-Box
  • 15. Language Challenges  Handle the mapping of input and output data into logical structures.  Support large-scale parallelism via either implicit parallelism or explicit declarations.  Support data partitioning and task partitioning.  Require a scalable, reliable, and efficient runtime system that can support Cloud-scale task scheduling and dispatching and provide error recovery and fault tolerance.
  • 16. Computing Challenges  A workflow system may not be able to talk to Cloud resources directly, so middleware services are needed (e.g. Nimbus or Falkon to handle resource provisioning and task dispatching).  This becomes more complicated when workflow resource requirements, data dependencies, and Cloud virtualization are considered.  An SWFMS will try to recover automatically when non-fatal errors happen. Smart-return: detailed execution information is logged so that the workflow can be restarted.
  • 17. Data management challenges  As data intensity increases, the management of data resources and of dataflow between storage and compute resources becomes the bottleneck.  Data locality: as CPU cycles get cheaper and data sets grow, data location becomes the biggest challenge, rather than computational resources.  Combining compute and data management: the amount of data movement needs to be minimized; otherwise, raw resources will be significantly underutilized.  Provenance: the derivation history of a data product. Tracking is needed across service providers and across different abstraction layers. Secure access is another piece that is currently missing.
  • 18. Service management challenges  The engineering of the components of an SWFMS as services:  thousands of services developed and available for the myExperiment project  the LEAD system has developed a tool to wrap and convert ordinary science applications into services  The orchestration and invocation of services from an SWFMS  managing the large number of service instances  data movements across different service instances
  • 19. RESEARCH DIRECTIONS  Emphasize a workflow reference architecture and direct research effort to the layers discussed above.  A great leap in middleware development: resource management, monitoring, messaging.  Many-task computing (MTC): preliminarily applied on Grids and supercomputers, and expected to improve greatly for the Cloud.  Scripting: mixture of semantics, combination of application of services…  Cost optimization: very challenging, but rewarding too.
  • 20. RESEARCH DIRECTIONS  SWFMS security  Access control: critical because of the nature of Clouds (dynamic, with large-scale data and service sharing).  Information flow control: ensure that workflow-related information is propagated only to authorized recipients.  Secure electronic transaction protocol: pay-as-you-go pricing model
  • 21. CONCLUSIONS  As more customers and applications migrate to the Cloud, the need for workflow systems to manage complex tasks will become more urgent.  For now, mash-ups and MapReduce-style task management act in place of a workflow system in the Cloud.  The opportunities and challenges of bringing workflow systems into the Cloud are discussed.  The authors identify key research directions for realizing scientific workflows in Cloud environments.
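
Slide 14 describes MapReduce as a computing model built on two key functions, Map and Reduce, and the conclusions note that MapReduce-style task management currently stands in for a workflow system in the Cloud. The following is a minimal single-process sketch of that model, not code from the paper or from any MapReduce framework. The function names (map_words, shuffle, reduce_counts, run_mapreduce) and the word-count task are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce: combine all values emitted for one key into a single result."""
    return word, sum(counts)


def run_mapreduce(lines):
    """Run the map, shuffle, and reduce phases in sequence (hypothetical driver)."""
    intermediate = chain.from_iterable(map_words(line) for line in lines)
    grouped = shuffle(intermediate)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(run_mapreduce(["cloud workflow", "scientific workflow on the cloud"]))
```

In a real deployment the map and reduce calls would run in parallel on many nodes; here they run sequentially so that the division of work between the two user-supplied functions stays visible. This is the "white-box" style the slide refers to: the user writes the functions themselves, in contrast to a coordination language that invokes existing applications unmodified.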

Editor's Notes

  1. Cloud Computing has proven to be one of the great disruptive technologies of our time, and the effects of its increasing adoption and maturation will ripple out. Cloud Computing is here to stay, and as developers become more aware of the immense potential.