Csc8710 001 winter2014-mohammed_shahnawazali-ff2687_presentation_1


Published in: Education, Technology, Business

    1. Opportunities and Challenges for Running Scientific Workflows on the Cloud. CSC 8710-001 – Presentation I. Mohammed Shahnawaz Ali
    2. Executive Summary: Cloud computing has been discussed in recent years in relation to services or infrastructure resources that can be contracted over a network, endorsing the idea of renting infrastructure instead of buying it. Cloud computing infrastructures thus enable companies to cut costs by outsourcing or offloading computations on demand, and the model has gained tremendous momentum in both academia and industry. The application of cloud computing, however, has mostly focused on Web applications and business applications, while the use of cloud computing to support large-scale workflows, especially data-intensive scientific workflows, is still largely overlooked. 1/29/2014 CSC 8710 - Presentation I 2
    3. Executive Summary (cont’d): The paper coins the term “Cloud Workflow” to refer to the specification, execution, and provenance tracking of large-scale scientific workflows, as well as the management of data and computing resources to enable the execution of scientific workflows on the Cloud. The paper analyzes: 1. why there has been such a gap between the two technologies; 2. what it means to bring Cloud and workflow together; 3. what the key challenges are in running Cloud workflows; 4. what the research opportunities are in realizing workflows on the Cloud.
    4. Introduction. Cloud – The Origin: The term “cloud” has its origins in network diagrams that represented the internet, or various parts of it, as schematic clouds. The term “cloud computing” was coined for what happens when applications and services are moved into the internet “cloud.” Cloud computing did not appear overnight; in some form it may trace back to a time when computer systems remotely time-shared computing resources and applications. Today, cloud computing refers to the many different types of services and applications being delivered in the internet cloud, and to the fact that, in many cases, the devices used to access these services and applications do not require any special applications.
    5. Introduction. Cloud – The Representation: Cloud computing represents: 1. a different way to architect and remotely manage computing resources; 2. network-based services that appear to be provided by real server hardware but are in fact served by virtual hardware, simulated by software running on one or more real machines. Cloud computing offerings today are suitable to: 1. host enterprise architectures and provide clear benefit to corporations by providing capabilities complementary to what they have; 2. help elastically scale enterprise architectures.
    6. Introduction. Cloud – The Metaphor: In common usage, the term "the cloud" is essentially a metaphor for the Internet. Marketers have further popularized the phrase "in the cloud" to refer to software, platforms, and infrastructure that are sold as a service, i.e. remotely through the Internet. Typically, the seller has actual energy-consuming servers which host products and services from a remote location, so end-users don't have to; they can simply log on to the network without installing anything. Depending on their field of interest, software, service, or infrastructure providers highlight different aspects.
    7. Introduction. Cloud – The Offerings: Some notable companies delivering services from the cloud include: 1. Google: has a private cloud that it uses for delivering many different services to its users, including email access, document applications, text translations, maps, web analytics, and much more. 2. Microsoft: has the Microsoft SharePoint online service, which allows content and business intelligence tools to be moved into the cloud, and currently makes its office applications available in a cloud. 3. — Runs its application set for its customers in a cloud, and its and products provide developers with platforms to build customized cloud services.
    8. History: • 2006: August 24, 2006 arguably goes down as the birthday of cloud computing, as it was on this day that Amazon made the test version of its Elastic Compute Cloud (EC2) public. This offer, providing flexible IT resources (computing capacity), marked a definitive milestone in dynamic business relations between IT users and providers. • 2007: The term first became popular in 2007, as attested by the first entry in the English Wikipedia, dated March 3, 2007, which, again significantly, contained a reference to utility computing. • 2008: In 2008, there was a glut of active parties in the increasingly popular field of cloud computing. • Today, "Cloud Computing" generates over 10.3 million matches on Google.
    9. Cloud Computing
    10. Cloud Computing – The Characteristics: Cloud computing has a variety of characteristics, the main ones being: 1. Shared Infrastructure: uses a virtualized software model, enabling the sharing of physical services, storage, and networking capabilities. The cloud infrastructure, regardless of deployment model, seeks to make the most of the available infrastructure across a number of users. 2. Dynamic Provisioning: allows services to be provisioned based on current demand. This is done automatically using software automation, enabling the expansion and contraction of service capability as needed. This dynamic scaling needs to be done while maintaining high levels of reliability and security.
    11. Cloud Computing – The Characteristics (cont’d): 3. Network Access: services need to be accessible across the internet from a broad range of devices such as PCs, laptops, and mobile devices, using standards-based APIs (for example, ones based on HTTP). Deployments of services in the cloud include everything from business applications to the latest application on the newest smartphones. 4. Managed Metering: uses metering to manage and optimize the service and to provide reporting and billing information. In this way, consumers are billed for services according to how much they have actually used during the billing period.
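The managed-metering idea above can be illustrated with a minimal sketch. The resource names and unit prices below are hypothetical and do not reflect any provider's actual rates; amounts are kept in integer cents to avoid floating-point rounding.

```python
from dataclasses import dataclass

# Hypothetical unit prices in cents; not taken from any real provider.
RATES_CENTS = {"vm-hours": 10, "gb-stored": 2}


@dataclass(frozen=True)
class UsageRecord:
    resource: str   # key into RATES_CENTS (illustrative names)
    quantity: int   # metered units consumed in the billing period


def bill_cents(records):
    """Total charge for a billing period: sum of rate * metered quantity."""
    return sum(RATES_CENTS[r.resource] * r.quantity for r in records)


usage = [UsageRecord("vm-hours", 100), UsageRecord("gb-stored", 50)]
total = bill_cents(usage)
```

The consumer pays only for the recorded quantities, which is the pay-per-use behaviour the slide describes.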
    12. Cloud Computing – The Service Models: Once a cloud is established, the way its services are deployed in terms of business models can differ depending on requirements. The primary service models being deployed are commonly known as: 1. Software as a Service (SaaS): consumers purchase the ability to access and use an application or service that is hosted in the cloud. A benchmark example of this is, as discussed previously, where necessary information for the interaction between the consumer and the service is hosted as part of the service in the cloud. 2. Platform as a Service (PaaS): consumers purchase access to the platforms, enabling them to deploy their own software and applications in the cloud. The operating systems and network access are not managed by the consumer, and there might be constraints as to which applications can be deployed.
    13. Cloud Computing – The Service Models (cont’d): 3. Infrastructure as a Service (IaaS): consumers control and manage the systems in terms of the operating systems, applications, storage, and network connectivity, but do not themselves control the cloud infrastructure. 4. Communications as a Service (CaaS): a model used to describe hosted IP telephony services. Along with the move to CaaS comes a shift to more IP-centric communications and more SIP trunking deployments. With IP and SIP in place, it can be as easy to have the PBX in the cloud as it is to have it on the premises. In this context, CaaS could be seen as a subset of SaaS.
    14. Cloud Computing – The Deployment Models: Cloud computing deployments can differ depending on requirements, and the following four deployment models have been identified, each with specific characteristics that support the needs of the services and users of the clouds in particular ways: 1. Private Cloud: the cloud infrastructure is deployed, maintained, and operated for a specific organization. The operation may be in-house or with a third party on the premises. 2. Community Cloud: the cloud infrastructure is shared among a number of organizations with similar interests and requirements. This may help limit the capital expenditure for its establishment, as the costs are shared among the organizations. The operation may be in-house or with a third party on the premises.
    15. Cloud Computing – The Deployment Models (cont’d): 3. Public Cloud: the cloud infrastructure is made available to the public on a commercial basis by a cloud service provider. This enables a consumer to develop and deploy a service in the cloud with very little financial outlay compared to the capital expenditure normally associated with other deployment options. 4. Hybrid Cloud: the cloud infrastructure consists of a number of clouds of any type, whose interfaces allow data and/or applications to be moved from one cloud to another. This can be a combination of private and public clouds that supports both the requirement to retain some data within an organization and the need to offer services in the cloud.
    16. Cloud Computing – The Benefits: The following are some of the possible benefits for those who offer cloud computing-based services and applications: 1. Cost Savings: companies can reduce their capital expenditures and use operational expenditures to increase their computing capabilities. This lowers the barrier to entry and also requires fewer in-house IT resources to provide system support. 2. Scalability/Flexibility: companies can start with a small deployment, grow to a large deployment fairly rapidly, and then scale back if necessary. The flexibility of cloud computing also allows companies to use extra resources at peak times, enabling them to satisfy consumer demands.
    17. Cloud Computing – The Benefits (cont’d): 3. Reliability: services using multiple redundant sites can support business continuity and disaster recovery. 4. Maintenance: cloud service providers do the system maintenance, and access is through APIs that do not require application installations onto PCs, further reducing maintenance requirements. 5. Mobile Accessibility: mobile workers have increased productivity because systems are accessible in an infrastructure available from anywhere.
    18. Scientific Workflows
    19. Scientific Workflows: • Scientific workflows are an amalgamation of scientific problem-solving and traditional workflow techniques. • They form another class of workflows, in addition to business workflows, that emerge in sophisticated scientific problem-solving environments and applications such as climate modeling, structural biology and chemistry, medical surgery, or disaster recovery simulation. • Compared with business workflows, scientific workflows have special features such as computation, data, or transaction intensity, less human interaction, and a large number of activities.
    20. Scientific Workflow Management System: The reference architecture for an SWFMS consists of four logical layers, seven major functional subsystems, and six interfaces: 1. Operational Layer: consists of a wide range of heterogeneous and distributed data sources, software tools, services, and their operational environments, including high-end computing environments. 2. Task Management Layer: consists of three subsystems: Data Product Management, Provenance Management, and Task Management. 3. Workflow Management Layer: consists of the Workflow Engine and Workflow Monitoring subsystems. 4. Presentation Layer: consists of the Workflow Design subsystem and the Presentation and Visualization subsystem.
    21. Scientific Workflows – On The Cloud Today: Cloud computing has been widely accepted and applied to Web applications and business applications. However, cloud capabilities have not been successfully extended to execute and manage workflow applications, especially data-intensive scientific workflows. The current state of workflow organization on the Cloud has been either: 1. static predefined pipelines based on batch-style scripts, or graphs based on the MapReduce programming model; or 2. ad hoc mashups connected together with, again, scripts that parse the output of one web application and feed it into another.
    22. Scientific Workflows – On The Cloud Today (cont’d): Several scientific workflow management systems (SWFMSs) have been successfully applied over a number of execution environments, namely local hosts, clusters/grids, and supercomputers. However, Cloud computing provides a paradigm-shifting, utility-oriented computing model in terms of the unprecedented size of the datacenter-level resource pool and the on-demand resource provisioning mechanism, enabling scientific workflow solutions capable of addressing peta-scale scientific problems.
    23. Cloud Workflow – Bringing them together: The term Cloud Workflow brings the terms Cloud and Scientific Workflows together. It refers to the following attributes of scientific workflows: 1. specification, 2. execution, 3. provenance tracking, along with the management of data and computing resources to enable the running of scientific workflows on the Cloud.
    24. Opportunities
    25. Cloud Workflow – The Opportunities: Clouds provide a multiplicity of opportunities that are more technological in nature; these opportunities stem primarily from the extensive use of service-oriented architectures and virtualization in clouds. Scalability: • The scale of scientific problems that can be addressed by scientific workflows, previously limited by the size of a dedicated resource pool, is now greatly increased. • That scale is reflected not only in the data sizes that scientific applications need to handle, but also in the complexity of the applications themselves. • Cloud platforms can offer vast amounts of storage space as well as computing resources for applications across multiple disciplines including physics, earth science, and medicine, allowing scientific discoveries to be carried out at an unprecedented scale.
    26. Cloud Workflow – The Opportunities: Dynamic resource allocation: • Allocating resources only when they are needed offers various advantages, including: • Optimum resource utilization. • Improved end-user experience. • Collaborative batch-based scientific workflows.
    27. Cloud Workflow – The Opportunities: Relinquishing allocated resources: • The Cloud allows users to return resources on demand. • This enables workflow systems to easily grow and shrink the available resource pool as the needs of the workflow change over time. • The pool can closely match the needs of the application by acquiring or releasing resources for optimal usage.
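Growing and shrinking a resource pool with the workload can be sketched as a simple sizing rule. This is only an illustration under stated assumptions: the policy (one worker per fixed number of pending tasks, with hypothetical lower and upper bounds) is not taken from the presentation, and no real cloud API is called.

```python
import math


def target_pool_size(pending_tasks, tasks_per_worker, min_workers=0, max_workers=100):
    """Workers to hold for the current backlog (hypothetical sizing policy)."""
    needed = math.ceil(pending_tasks / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))


def resize(current_workers, pending_tasks, tasks_per_worker):
    """Return (acquire, release) counts that move the pool to its target size."""
    target = target_pool_size(pending_tasks, tasks_per_worker)
    delta = target - current_workers
    return (delta, 0) if delta > 0 else (0, -delta)


grow = resize(current_workers=2, pending_tasks=40, tasks_per_worker=4)
shrink = resize(current_workers=10, pending_tasks=4, tasks_per_worker=4)
```

In a real system the acquire and release counts would be passed to the provider's provisioning interface; here they are just numbers.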
    28. Cloud Workflow – The Opportunities: Performance-to-cost trade-off: • Cloud computing provides much more room for the trade-off between performance and cost. • The spectrum of resource investment now ranges from: • dedicated private resources, • a hybrid resource pool combining local resources and remote clouds, • full outsourcing of computing and storage to public Clouds. • Cloud computing not only provides the potential to solve larger-scale scientific problems, but also brings the opportunity to improve the performance/cost ratio.
    29. Cloud Workflow – The Opportunities: Heterogeneous application support: • Clouds and their use of virtualization technology make different heterogeneous applications much easier to run together. • Virtualization enables the environment to be customized to suit the application. • An environment with its operating system, applications, and their configurations can be bundled up as a virtual machine image and redeployed on a cloud to run the workflow.
    30. Cloud Workflow – The Opportunities: Resource provisioning: • Instead of delegating allocation to the resource manager, the user directly provisions the resources required and schedules computations using a user-controlled scheduler. • The provisioning model is ideal for workflows and other loosely coupled applications because it enables the application to allocate a resource once and use it to execute many tasks. • This reduces the total scheduling overhead, which in turn can dramatically improve workflow performance.
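A small worked calculation shows why allocating a resource once and reusing it for many tasks reduces total scheduling overhead. The numbers (1000 tasks, 10 s each, 30 s of overhead per allocation) are hypothetical, and tasks are assumed to run serially on a single resource.

```python
def total_runtime(n_tasks, task_seconds, overhead_seconds, provision_once):
    """Serial runtime when allocation overhead is paid once or once per task."""
    overhead_events = 1 if provision_once else n_tasks
    return n_tasks * task_seconds + overhead_events * overhead_seconds


per_task_allocation = total_runtime(1000, 10, 30, provision_once=False)
single_allocation = total_runtime(1000, 10, 30, provision_once=True)
```

Under these assumed numbers the per-task allocation run takes 40000 s and the single-allocation run 10030 s, so the overhead rather than the computation dominates the first case.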
    31. Cloud Workflow – The Opportunities: Provenance and re-imaging: • Virtualization allows one to capture the exact environment that was used to perform a computation, including all of the software and configuration used in that environment. • The virtual machine image can be stored along with the provenance of the workflow. • Redeploying the virtual machine image recreates exactly the same environment that was used to run the original experiment.
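One way to picture storing the virtual machine image alongside provenance is a record that carries a digest of the image. This is a sketch under assumptions: the record fields and helper names are hypothetical, and a real system would reference an image in a provider's registry rather than hashing raw bytes in memory.

```python
import hashlib
from dataclasses import dataclass


def image_digest(image_bytes: bytes) -> str:
    """SHA-256 digest used here as a stand-in identifier for a VM image."""
    return hashlib.sha256(image_bytes).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    workflow_id: str
    task: str
    inputs: tuple
    image_digest: str  # identifies the environment the task ran in


def same_environment(record: ProvenanceRecord, candidate_image: bytes) -> bool:
    """True if the candidate image matches the one recorded for the run."""
    return record.image_digest == image_digest(candidate_image)


record = ProvenanceRecord("wf-1", "align", ("a.fasta",), image_digest(b"image-v1"))
```

Checking the digest before a rerun confirms that the redeployed image is the one used for the original experiment.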
    32. Challenges
    33. Cloud Workflow – The Challenges: Despite the advantages and opportunities that Cloud computing offers scientific workflows, there are many major obstacles to adapting and running scientific workflows on the Cloud. Architectural Challenge: The following seven are key architectural requirements for an SWFMS: 1. User interface customizability and user interaction support. 2. Reproducibility support. 3. Heterogeneous and distributed services and software tools integration. 4. Heterogeneous and distributed data product management. 5. High-end computing support. 6. Workflow monitoring and failure handling. 7. Interoperability.
    34. Cloud Workflow – The Challenges: Architectural Challenge (cont’d): There are four possible solutions for deploying the reference architecture in a Cloud computing environment: 1. Operational-Layer-in-the-Cloud: only the Operational Layer is deployed in the Cloud, with the SWFMS running outside the Cloud. 2. Task-Management-Layer-in-the-Cloud: both the Operational Layer and the Task Management Layer are deployed in the Cloud. 3. Workflow-Management-Layer-in-the-Cloud: the Operational Layer, the Task Management Layer, and the Workflow Management Layer are deployed in the Cloud, with the Presentation Layer deployed on a client machine. 4. All-in-the-Cloud: the whole SWFMS is deployed inside the Cloud and is accessible via a Web browser.
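The four deployment options differ only in how many layers of the reference architecture, counted from the Operational Layer upward, sit in the Cloud. A minimal sketch of that mapping follows; the dictionary layout and the labels "cloud" and "outside" are illustrative choices, not part of the reference architecture.

```python
# Layers of the SWFMS reference architecture, bottom to top.
LAYERS = ["Operational", "Task Management", "Workflow Management", "Presentation"]

# Number of layers (counted from the bottom) placed in the Cloud per option.
DEPLOYMENTS = {
    "Operational-Layer-in-the-Cloud": 1,
    "Task-Management-Layer-in-the-Cloud": 2,
    "Workflow-Management-Layer-in-the-Cloud": 3,
    "All-in-the-Cloud": 4,
}


def placement(option):
    """Map each layer to 'cloud' or 'outside' for a given deployment option."""
    n_in_cloud = DEPLOYMENTS[option]
    return {
        layer: ("cloud" if index < n_in_cloud else "outside")
        for index, layer in enumerate(LAYERS)
    }


example = placement("Workflow-Management-Layer-in-the-Cloud")
```

For the third option, only the Presentation Layer is mapped to "outside", matching the description of a client-side presentation layer.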
    35. Cloud Workflow – The Challenges: Integration Challenge: The integration problem includes the following: 1. In the operational-layer-in-the-Cloud approach, applications, services, and tools hosted in the Cloud are treated as task units in a workflow; the scheduling and management of the workflow remain mostly outside the Cloud, and the task units are invoked as they are scheduled to execute. 2. Once task dispatching and scheduling are moved into the Cloud, resource provisioning becomes the next issue. 3. The uncapped resources requested by a workflow come at a cost. 4. Debugging, monitoring, and provenance tracking for a workflow can be even more difficult in the Cloud: resources are usually assigned dynamically and are based on virtual machine instances, so the environment a task executes in could be destroyed right after the task finishes and reassigned to a completely different user and task. 5. Porting an SWFMS into the Cloud is also a concern; it usually involves wrapping the SWFMS as a Cloud service.
    36. Cloud Workflow – The Challenges: Language Challenge: Languages adopted for cloud computing include: 1. MapReduce is the “only” widely adopted computing model, and there are a number of variations of languages based on this model for task specification in the Cloud. 2. White-box approach: MapReduce and its variations require application logic to be rewritten to follow the map-reduce-merge programming model. Users therefore need to fully understand the applications and port them before they can leverage the parallel computing infrastructure. 3. Black-box approach: SwiftScript serves as a general-purpose coordination language, in which existing applications can be invoked without modification. 4. Mashups and ad hoc scripts (JavaScript, PHP, Python, etc.) have become key technologies for developing Web applications that dynamically integrate multiple data or service sources.
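To make the white-box point concrete, here is a word count rewritten into map, shuffle, and reduce steps. It is a single-process sketch in plain Python that mimics the shape of the model; it does not use Hadoop or any real MapReduce runtime, and the function names are illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (word, 1) pairs for one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the runtime would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["cloud workflow", "Cloud"])
```

Even this trivial task had to be restructured around keys and values, which is the porting effort the white-box approach demands; a black-box coordination language would instead invoke an existing word-count program unchanged.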
    37. Cloud Workflow – The Challenges: Language Challenge (cont’d): The language challenges include the following. A workflow language must: 1. Handle the mapping from input and output data into logical structures to facilitate data integration and logical operations on data. 2. Support large-scale parallelism via either implicit parallelism or explicit declaratives such as Parallel Foreach. 3. Support data partitioning and task partitioning. 4. Rely on a scalable, reliable, and efficient runtime system that can support Cloud-scale task scheduling and dispatching, provide error recovery and fault tolerance under all kinds of hardware and service failures, and utilize a large pool of Cloud resources efficiently.
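Data partitioning combined with an explicit parallel foreach can be sketched with Python's standard `concurrent.futures` module. The helper names `partition` and `parallel_foreach` are hypothetical, and a thread pool on one machine only stands in for Cloud-scale dispatching.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for index in range(n_parts):
        end = start + size + (1 if index < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_foreach(func, items, n_parts=4):
    """Apply func to every item, one chunk per worker, preserving input order."""
    def run_chunk(chunk):
        return [func(item) for item in chunk]

    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        chunk_results = list(pool.map(run_chunk, partition(items, n_parts)))
    return [result for chunk in chunk_results for result in chunk]


squares = parallel_foreach(lambda x: x * x, list(range(10)), n_parts=3)
```

Because `Executor.map` yields results in submission order, the output order matches the input order even though chunks may finish at different times.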
    38. Cloud Workflow – The Challenges: Computing Challenge: The computing challenges include the following: 1. Managing large-scale computing resources. 2. Workflow systems may not be able to talk to Cloud resources directly; they may still need to go through middleware services such as Nimbus and Falkon that handle resource provisioning and task dispatching. 3. Workflow resource requirements, data dependencies, Cloud virtualization, and similar factors make things even more complicated. 4. Additional measures are needed to support large workflows and components.
    39. Cloud Workflow – The Challenges: Data Management Challenge: The data management challenges include the following: 1. Analyzing, visualizing, and disseminating large data sets. 2. Managing data resources and the dataflow between storage and compute resources in data-intensive applications. The following aspects of data management within a Cloud are important from a workflow perspective: 1. Data locality: a. The location of the data relative to the available computational resources matters; moving data repeatedly to distant CPUs is expensive and inefficient. b. Data need to be distributed over many computers to achieve good scalability. 2. Combining compute and data resource management. 3. The scalability of Clouds requires scalable provenance systems to handle storage and querying of potentially millions of tasks.
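The data locality concern can be illustrated with a placement rule that sends a task to the free node already holding the most of its input data. This is a minimal sketch assuming a hypothetical replica catalog (file name to set of node names); it is not the scheduling policy of any particular workflow system.

```python
def pick_node(task_inputs, replicas, free_nodes):
    """Choose the free node that already stores the most input bytes.

    task_inputs: dict of file name -> size in bytes
    replicas:    dict of file name -> set of nodes holding a copy
    free_nodes:  iterable of node names available to run the task
    """
    def local_bytes(node):
        return sum(
            size for name, size in task_inputs.items()
            if node in replicas.get(name, ())
        )

    # Sorting makes ties deterministic: the alphabetically first node wins.
    return max(sorted(free_nodes), key=local_bytes)


inputs = {"a.dat": 100, "b.dat": 10}
catalog = {"a.dat": {"n1"}, "b.dat": {"n2", "n3"}}
chosen = pick_node(inputs, catalog, {"n1", "n2"})
```

Here node `n1` is chosen because it already holds 100 bytes of input against 10 bytes on `n2`, so less data has to move to the computation.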
    40. Cloud Workflow – The Challenges: Service Management Challenge: The service management challenges (orchestrating and invoking services via an SWFMS) include the following: 1. Service description, discovery, and composition. 2. Managing the large number of service instances. 3. Data movements across service instances involving large data volumes. 4. For a workflow to invoke publicly available services, the SWFMS also needs to handle security, interoperability, and data transformation issues.
    41. Cloud Workflow – The Challenges: Storage Challenge: The storage challenges include the following: 1. Commercial clouds often deploy structured or object-based storage services that can be utilized by workflow applications. 2. In the absence of standard file system interfaces, application codes must either be modified to interface with the storage services or be wrapped with additional workflow components to do the translation. 3. Deploying a temporary shared file system in the cloud as part of a virtual cluster is complex, potentially costly, and requires an additional step to ensure that desired outputs are transferred to permanent storage. 4. Storage security.
    42. Cloud Workflow – The Challenges: Network and Tools Challenge: The network and tools challenges include the following: 1. Data-intensive workflows depend on high-performance networks to achieve good performance. 2. They require high throughput, though not necessarily low latency, and faster networks. 3. Setting up an environment to run workflows in the cloud. 4. There is some work on virtual appliances, but those are typically designed for single nodes and not for clusters of nodes.
    43. Research Directions
    44. Cloud Workflow – The Research Directions: The key areas for research efforts in cloud-based workflows: Architecture: 1. Implement the key components in the different layers of the SWFMS architecture with interoperability and reusability in mind. This would help leverage existing Cloud technologies, such as monitoring, data management, resource provisioning, etc. 2. Leverage middleware technologies that bridge existing workflow systems with the Cloud to be more cost-effective. Scripting: 1. Scripting has the advantage of being concise and flexible, yet powerful when combined with parallel semantics and logical operations. 2. Expect to see scripting languages that have a mixture of these semantics, combining the coordination of applications and services. Cost: 1. Analyze the cost of computation and resource utilization to estimate and optimize the ROI.
    45. Cloud Workflow – The Research Directions: Provenance: 1. Provenance can adopt the SOA model, making it less tightly coupled with an SWFMS than it currently is. Security: 1. Security is the first major service that needs to be provided by a Cloud provider. a. Access control: Due to the dynamic nature of the Cloud and its large-scale sharing of data, metadata, and services, access control is a challenging but important research problem. b. Information flow control: Since a scientific workflow might orchestrate a large number of distributed services, data, and applications, particularly in a large-scale Cloud environment, a mechanism that prevents mission-critical information and intellectual property from being propagated to an unauthorized user is important. c. Secure electronic transaction protocol: To prevent the abuse of Cloud accounts and double or wrong charges by a Cloud provider, further research might be needed to ensure the security of Cloud-based transaction protocols.
    46. Closing Notes: • The benefit of cloud computing for science is not necessarily in its utility computing and economic aspects, which are not new for academic computing. The benefit of clouds lies rather in their technological features, which stem from service-oriented architecture and virtualization. • Much work is needed to bring cloud platforms up to the performance level of the grid. This includes developing cloud storage systems that are appropriate for workflow and other science applications, as well as tools to help scientists and workflow engineers deploy their applications in the cloud. • As more and more customers and applications migrate into the Cloud, the need for workflow systems that manage the ever more complex task dependencies and handle issues such as large parameter space exploration, smart reruns, and provenance tracking will become more urgent. • The Cloud needs more structured and mature workflow technologies, and vice versa: the Cloud offers unprecedented scalability to workflow systems and could potentially change the way we perceive and conduct scientific experiments.
    47. Thank You