Data-Intensive
Research Workshop
Soaring through clouds with Meandre



Xavier Llorà and Bernie Ács
xllora@illinois.edu
be...
Soaring the Clouds with Meandre

Description of NCSA's cloud effort and how to orchestrate clouds using Meandre

Published in: Technology, Business

  1. Data-Intensive Research Workshop
     Soaring through clouds with Meandre
     Xavier Llorà and Bernie Ács
     xllora@illinois.edu, bernie@ncsa.illinois.edu
     National Center for Supercomputing Applications
     University of Illinois at Urbana-Champaign
  2. Part 1: Cloud Overview & Introduction
     •  Basic Cloud Concepts
          •  An Ideological Metaphor & Definition
          •  Example: TechNet Virtual Labs
     •  Cloud Classification Types
          •  Public, Private, & Hybrid Deployments
     •  Cloud Computing Models
          •  Infrastructure aaS, Platform aaS, & Software aaS
     •  NCSA Virtual Machines & Enterprise Cloud
          •  VMware, Xen, & Eucalyptus
          •  ElasticFox & AMS Web Application
     •  NCSA Cloud Conduits
     •  Cloud Computing & Programming Paradigms
     Imaginations unbound
  3. An Ideological Metaphor & Definition
     •  Cloud Metaphor
          •  The term cloud is used as a metaphor for the Internet, based on how it is depicted in computer network diagrams, and is an abstraction for the complex infrastructure it conceals.
     •  Cloud Computing – Definition
          •  The first academic use of this term appears to define it as a computing paradigm where the boundaries of computing will be determined by economic rationale rather than technical limits.
          •  Cloud computing is a paradigm of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.
     http://en.wikipedia.org/wiki/Cloud_computing
  4. An Example: TechNet Virtual Labs
     [Screenshot with callouts 1, 2, and 3]
     http://www.microsoft.com/events/vlabs/defaults.aspx
  5. Step 1: Builds Lab
  6. Step 2: Lab is Ready
  7. Step 3: Controlling with Lab Machines
  8. Step 4: Interacting with Virtual Machines
  9. The Tutorial Session Can Be Freely Used
  10. Cloud Classification Types
     •  Public cloud or external cloud describes cloud computing in the traditional mainstream sense, whereby resources are dynamically provisioned on a fine-grained, self-service basis over the Internet, via web applications/web services, from an off-site third-party provider who shares resources and bills on a fine-grained utility computing basis.
     •  Private cloud and internal cloud are neologisms that describe configurations that emulate (public) cloud computing on private networks.
     •  Hybrid cloud consists of multiple internal and/or external cloud deployments.
     http://en.wikipedia.org/wiki/Cloud_Computing
  11. Cloud Computing Models
     •  Infrastructure as a Service (IaaS)
          •  The delivery of computer infrastructure (typically a platform virtualization environment) as a service.
          •  Rather than purchasing servers, software, data center space or network equipment, clients instead buy those resources as a fully outsourced service.
          •  The service is typically billed on a utility computing basis, and the amount of resources consumed (and therefore the cost) will typically reflect the level of activity.
          •  Supersedes the term Hardware as a Service (HaaS)
          •  It is an evolution of web hosting and virtual private server offerings.
          •  Example: Amazon EC2/S3 services
     http://en.wikipedia.org/wiki/Infrastructure_as_a_service
  12. Cloud Computing Models
     •  Platform as a Service (PaaS)
          •  Delivery of a computing platform and solution stack as a service.
          •  It facilitates deployment of applications without the cost and complexity of buying and managing the underlying hardware and software layers, providing all of the facilities required to support the complete life cycle of building and delivering web applications and services entirely available from the Internet, with no software downloads or installation for developers, IT managers or end-users.
          •  Open Platform as a Service (OPaaS)
               •  another step in the Application Service Provider, SaaS, PaaS evolution
          •  Example: Microsoft TechNet VLabs
     http://en.wikipedia.org/wiki/Platform_as_a_service
  13. Cloud Computing Models
     •  Software as a Service (SaaS)
          •  A model of software deployment whereby a provider licenses an application to customers for use as a service on demand.
          •  Vendors may host the application on their own web servers or download the application to the consumer device, disabling it after use or after the on-demand contract expires.
          •  Examples:
               •  Google Apps (Maps, Docs, and Others)
               •  Adobe (Connect & Buzzword)
               •  Microsoft (Office Live Workspace)
     http://en.wikipedia.org/wiki/Platform_as_a_service
  14. NCSA Virtual Machines & Enterprise Cloud
  15. NCSA Uses Virtual Machine Technologies
     •  Virtual machine technology to support projects & services using VMware, XenServer, & Others
     •  An Example Case: ICLCS & WebMO
          •  Institute for Chemistry Literacy Through Computational Science (http://Iclcs.uiuc.edu/workshops & http://www.webmo.net/)
     [Diagram labels: Internet Users; Active LB Node; Passive LB Node; Worker Nodes; Shared Network File System; Centralized Relational Database]
  16. NCSA Enterprise Cloud
     •  Virtual Machine Infrastructure Expansion
          •  Dedicated Resources
               •  176 Cores/18 Machines with 50TB Storage and 40Gb IB
               •  Dedicated Switches, Network services for VM & Cloud.
     •  Eucalyptus installation base
          •  "Amazon at home"
          •  EC2/S3/EBS
     •  Potential future support for
          •  dynamic load-balanced services & load-based procurement
     •  High degree of variability possible in configurations
          •  Account based virtual private enterprise
          •  Elastic IP, Elastic Block Storage, & Elastic Computing
     •  Empowers users versus Constrains users
     •  Cloud mechanics require a steep learning curve
  17. NCSA Enterprise Cloud User Tools
     •  Command Line Tools
          •  Amazon Web Services API compatible tools (euca-*)
          •  Customizations and Refinements
     •  ElasticFox (Version 1.6)
          •  Firefox plugin works well; has required modification, more to do.
     [Screenshot: List, Launch, & Manage Images]
  18. NCSA Enterprise Cloud User Tools
     [Screenshot: Enterprise Security Rules]
  19. NCSA Enterprise Cloud User Tools
     [Screenshot: SSH Key-Pair Management]
  20. NCSA Enterprise Cloud User Tools
     [Screenshot: Allocate, Assign, & Associate Elastic IP]
  21. NCSA Enterprise Cloud User Tools
     [Screenshot: Allocate, Assign, & Associate Elastic Block Storage]
  22. NCSA Enterprise Cloud User Tools
     •  Command Line Tools
          •  Amazon Web Services API compatible tools (euca-*)
          •  Customizations and Refinements
     •  AWS Manager
          •  Statically deployed Web-Application
  23. NCSA Enterprise Cloud Conduits
     •  Private Cloud to Grid Conduit
          •  Dynamically Scalable Web Front-end & Middleware Layers
          •  Next Generation WebMO "Science Gateway"
          •  Batch Queue Proxy Integration, Metering, and Monitoring
     •  Private Cloud to Private Cloud Conduit
          •  Exploring Transparent Integration with Remote Sites
          •  UIUC Computer Science Hadoop Cluster
          •  Dynamic Integration with other Eucalyptus Sites
     •  Private Cloud to Public Cloud Conduit
          •  Exploring Transparent Integration with Amazon EC2 Service
          •  Roles of Virtual Private Network Services
          •  Dynamic Scalability and Data Localities
  24. Part 2: Cloud Programming Paradigm
     •  How are Software Architecture and Design Impacted by Virtual Machines & Cloud technologies?
          •  Natural Match for Multi-tier applications
          •  To best leverage cloud technology, applications need to be more modular and less monolithic
          •  Service-oriented architecture can benefit from JeOS (Just Enough Operating System) platforms and can be easily configured to dynamically scale
     •  Meandre: Overview & Introduction
          •  Agile Infrastructure for Data Intensive Applications
          •  Semantic-Oriented Component-Based Architecture
          •  Data Driven Execution Paradigm
     •  SEASR Application Examples
  25. MONK Project – GSLIS
     The SEASR project and its Meandre infrastructure are sponsored by The Andrew W. Mellon Foundation
  26. Feature Lens Blow up
  27. Date Entities to Simile Timeline
  28. Analyzing CSPAN Archives
  29. NEMA – Son of Blinkie - GSLIS
  30. NESTER – GSLIS
  31. NESTER - Birdie Audio – GSLIS
  32. NESTER - Birdie Audio – GSLIS
  33. Imaginations unbound
  34. Evolution Highway – IGB
  35. Fedora Commons Repository
     [Diagram labels: Components & Flows; Interactive Web Application; Web Service]
  36. Twitter For Research
  37. Data-intensive Computing for the Cloud
  38. Data-intensive Computing for the Cloud
     •  Meandre
          •  Integrates within Existing Applications
          •  May be a Free Standing Service
     •  Capitalize on elasticity
          •  Provide complex data computing as a service
     •  Collocating computation and data
          •  Natively access data in the cloud
               •  Hadoop Distributed File System (HDFS)
               •  Document stores
               •  Key-value stores
               •  Relational stores
  39. Meandre: The Dataflow Component
     •  Data dictates component execution semantics
     [Diagram: a Component box with Inputs, Outputs, and P; labels: "Descriptor in RDF of its behavior" and "The component implementation"]
  40. Meandre: Flow (Complex Tasks)
     •  A flow is a collection of connected components
     [Diagram: components Read, Merge, Get, Show, and Do, each marked P, connected into a flow; caption: Dataflow execution]
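To make "data dictates component execution semantics" concrete, here is a minimal Python sketch of data-driven firing. It is not Meandre code: the `Component` class, its port queues, and the `run` scheduler are hypothetical names invented for illustration, assuming only the rule from the slides that a component executes when data is available on its inputs and that a flow is a set of connected components.

```python
from collections import deque


class Component:
    """Hypothetical dataflow component: fires once every input port holds data."""

    def __init__(self, name, inputs, func):
        self.name = name
        self.inputs = list(inputs)                      # input port names
        self.queues = {port: deque() for port in self.inputs}
        self.func = func                                # maps input values to one output
        self.targets = []                               # (component, port) connectors

    def connect(self, target, port):
        """Route this component's output to an input port of another component."""
        self.targets.append((target, port))

    def ready(self):
        return all(self.queues[port] for port in self.inputs)

    def fire(self):
        args = [self.queues[port].popleft() for port in self.inputs]
        result = self.func(*args)
        for target, port in self.targets:
            target.queues[port].append(result)
        return result


def run(components):
    """Fire input-less components once, then fire whatever the data makes ready."""
    log = []
    for comp in components:
        if not comp.inputs:
            log.append((comp.name, comp.fire()))
    progress = True
    while progress:
        progress = False
        for comp in components:
            if comp.inputs and comp.ready():
                log.append((comp.name, comp.fire()))
                progress = True
    return log


# A small flow: two readers feed a merge, which feeds a show component.
read_a = Component("read_a", [], lambda: "hello")
read_b = Component("read_b", [], lambda: "world")
merge = Component("merge", ["left", "right"], lambda l, r: f"{l} {r}")
show = Component("show", ["text"], lambda t: t.upper())
read_a.connect(merge, "left")
read_b.connect(merge, "right")
merge.connect(show, "text")

# The listing order below does not decide execution order; data availability does.
log = run([show, merge, read_a, read_b])
```

Although `show` is listed first, it only fires after `merge` has produced its output, which in turn waits for both readers.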
  41. Meandre Connectors
     •  Flows are made up of "One or More" components with "None to Many" connectors that are described to the Meandre Server for management
     •  Flows may contain connectors that are cyclical over one or more components
     •  Flows must contain at minimum one component with NO Inputs to cause an Execute call to be made. *Outputs are Always Optional.
     •  Flow components may have multiple connectors assigned to any input data port
     •  Flows can have any number of components with
          •  "None to Many" Input data ports
          •  "None to Many" Output data ports
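The connector rules on this slide can be read as a set of structural checks. The sketch below expresses them in Python under stated assumptions: the dictionary layout for components and the tuple layout for connectors are hypothetical, and `validate_flow` is an illustrative helper, not part of the Meandre server.

```python
def validate_flow(components, connectors):
    """Check a flow description against the connector rules.

    components: dict of name -> {"inputs": [...], "outputs": [...]}  (assumed layout)
    connectors: list of (source, output_port, target, input_port)    (assumed layout)
    Returns a list of error strings; an empty list means the flow passes.
    """
    errors = []
    if not components:
        errors.append("a flow needs at least one component")
    elif all(spec["inputs"] for spec in components.values()):
        # Without an input-less component nothing triggers the first execution.
        errors.append("no component without inputs to start execution")
    for source, out_port, target, in_port in connectors:
        if source not in components or out_port not in components[source]["outputs"]:
            errors.append(f"unknown output {source}.{out_port}")
        if target not in components or in_port not in components[target]["inputs"]:
            errors.append(f"unknown input {target}.{in_port}")
    # Cycles and several connectors into one input port are allowed, so not checked.
    return errors


components = {
    "push": {"inputs": [], "outputs": ["string"]},
    "pass": {"inputs": ["string"], "outputs": ["string"]},
    "print": {"inputs": ["object"], "outputs": []},
}
connectors = [
    ("push", "string", "pass", "string"),
    ("pass", "string", "print", "object"),
    ("pass", "string", "pass", "string"),   # a cycle, and a second connector into pass.string
]
ok_errors = validate_flow(components, connectors)
loop_only = validate_flow(
    {"a": {"inputs": ["x"], "outputs": ["y"]}},
    [("a", "y", "a", "x")],
)
```

Here `ok_errors` is empty even though the flow contains a cycle and two connectors into the same port, while `loop_only` reports that nothing can start execution.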
  42. Meandre: ZigZag Script Language
     •  Automatic Parallelization
          •  Adding the operator [+4] would result in a directed graph ...

     # Describes the data-intensive flow
     #
     @pu = push()
     @pt = pass( string:pu.string ) [+4]
     print( object:pt.string )

     # Describes the data-intensive flow
     #
     @pu = push()
     @pt = pass( string:pu.string ) [+4!]
     print( object:pt.string )
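The slide text is truncated, so the following Python sketch rests on two assumptions that the source does not confirm: that `[+4]` asks for four parallel copies of the `pass` component, and that the trailing `!` in `[+4!]` asks for results to be delivered in their original order. The functions `push`, `pass_through`, `run_unordered`, and `run_ordered` are stand-ins written for this illustration using Python's standard `concurrent.futures` module; they are not ZigZag or Meandre APIs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def push():
    """Stand-in for the push() component: emits a batch of strings."""
    return [f"item-{i}" for i in range(8)]


def pass_through(value):
    """Stand-in for the pass() component: forwards its input unchanged."""
    return value


def run_unordered(items, copies=4):
    """Assumed reading of [+4]: results arrive as workers finish, in any order."""
    with ThreadPoolExecutor(max_workers=copies) as pool:
        futures = [pool.submit(pass_through, item) for item in items]
        return [future.result() for future in as_completed(futures)]


def run_ordered(items, copies=4):
    """Assumed reading of [+4!]: same parallelism, results kept in input order."""
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(pass_through, items))


unordered = run_unordered(push())
ordered = run_ordered(push())
```

Both variants deliver every item exactly once; only the ordered one guarantees that the downstream `print` step sees them in the order `push` emitted them.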
  43. Scaling Genetic Algorithms with Meandre
     Intel 2.8 GHz quad-core, 4 GB RAM. Average of 20 runs.
  44. And Beyond with Hadoop
     60 dual quad-core Xeons with 8 GB RAM. Gigabit Ethernet.
     [Plot annotation: Resources exhaustion]
  45. Are Components Black-Box Wrappers?
     •  Programming Components is multilingual
          •  Natively support: Java, Scala, Python, and Clojure
          •  Easily Wrap: R, C, and C++
     •  Components can also interact with the OS
          •  Leverage OS tools
          •  Orchestrate other programs
     •  The question:
          •  Can Meandre help orchestrate and facilitate interaction and cooperation between cloud and grid assets?
  46. Meandre Components for Amazon & Eucalyptus
  47. Cloud Conduits to the Grid
     •  Cloud mechanics have a steep learning curve
     •  Can Meandre help simplify the process?
     •  Orchestrating clouds with Meandre
          •  Amazon/Eucalyptus model
          •  Components can be created to:
               •  List images
               •  List instances
               •  Launch instances
               •  Allocate Elastic IP and Elastic Block Storage
               •  Transfer Data or Programs to running instances
               •  Trigger process computation
               •  Monitor processes and/or executing persistent services
               •  Terminate instances
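The list of orchestration components above can be pictured as one step per component, chained in order. The Python sketch below is only an illustration of that sequencing: `FakeCloud` is an in-memory stand-in invented here, its method names are hypothetical, and nothing in it talks to Amazon EC2 or Eucalyptus or mirrors their real APIs.

```python
class FakeCloud:
    """In-memory stand-in for an Amazon/Eucalyptus-style endpoint (hypothetical)."""

    def __init__(self, images):
        self.images = list(images)
        self.instances = {}
        self._next_instance = 0
        self._next_ip = 0

    def list_images(self):
        return list(self.images)

    def list_instances(self):
        return sorted(self.instances)

    def launch(self, image):
        if image not in self.images:
            raise ValueError(f"unknown image: {image}")
        self._next_instance += 1
        instance_id = f"i-{self._next_instance:04d}"
        self.instances[instance_id] = {"image": image, "state": "running", "files": [], "ip": None}
        return instance_id

    def allocate_ip(self, instance_id):
        self._next_ip += 1
        address = f"10.0.0.{self._next_ip}"
        self.instances[instance_id]["ip"] = address
        return address

    def transfer(self, instance_id, filename):
        self.instances[instance_id]["files"].append(filename)

    def trigger(self, instance_id, command):
        # A real service would run asynchronously; the fake finishes immediately.
        self.instances[instance_id]["state"] = f"finished: {command}"

    def status(self, instance_id):
        return self.instances[instance_id]["state"]

    def terminate(self, instance_id):
        del self.instances[instance_id]


def orchestrate(cloud, image, payload, command):
    """Run the listed steps in order and record what each one produced."""
    events = [("images", cloud.list_images())]
    instance_id = cloud.launch(image)
    events.append(("launched", instance_id))
    events.append(("ip", cloud.allocate_ip(instance_id)))
    cloud.transfer(instance_id, payload)
    events.append(("transferred", payload))
    cloud.trigger(instance_id, command)
    events.append(("status", cloud.status(instance_id)))
    cloud.terminate(instance_id)
    events.append(("remaining", cloud.list_instances()))
    return events


# Image, payload, and command names are placeholders for illustration.
events = orchestrate(FakeCloud(["emi-demo"]), "emi-demo", "job.jar", "run job.jar")
```

In a flow, each method call would become its own component, with the instance identifier passed along the connectors between them.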
  48. Meandre Cloud Orchestration Data Flow
  49. Conclusions
     •  Next generation data-intensive applications will:
          •  Use cloud computing technologies and conduits
          •  Require adaptation of programming paradigms
          •  Leverage a flexible architecture and a modular
          •  Promote processing and resources at scale.
     •  Meandre
          •  Data-intensive execution engine
          •  Component-based programming architecture
          •  Distributed data flow designs to allow processing to be co-located with data sources and enable transparent scalability
          •  Orchestrate cloud deployments
          •  Leverage cloud conduits
