Meandre Architecture

Pathway to SEASR Workshop in March 2009 in North Carolina


  1. SEASR: Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
       Xavier Llora, Bernie Acs, Loretta Auvil, Boris Capitanu, Michael Welge, David Goldberg
       National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
       {xllora, acs1, lauvil, capitanu, mwelge, deg}@illinois.edu
       The SEASR project and its Meandre infrastructure are sponsored by The Andrew W. Mellon Foundation
  2. Yes, It Is Not a Typo
  3. SEASR: Design Goals
       •  Transparency
            –  From a single laptop to an HPC cluster
            –  Not bound to a particular computation fabric
            –  Allow heterogeneous development
       •  Intuitive programming paradigm
            –  Modular, reusable components and flows
            –  Foster collaboration and sharing
       •  Open source
       •  Service-oriented architecture (SOA)
  4. Meandre: Infrastructure
       •  SEASR/Meandre infrastructure:
            –  Dataflow execution paradigm
            –  Semantic-web driven
            –  Web oriented
            –  Supports publishing services
            –  Modular components
            –  Encapsulation and execution mechanism
            –  Promotes reuse, sharing, and collaboration
  5. Meandre: Data-Driven Execution
       •  Execution paradigms
            –  Conventional programs perform computational tasks by executing a sequence of instructions.
            –  Data-driven execution applies transformation operations to a flow or stream of data as soon as the data is available.
       •  Dataflow approach: a component
            –  May have zero to many inputs
            –  May have zero to many outputs
            –  Performs a logical operation when data is available
  6. Meandre: Dataflow Example
       (Diagram: inputs Value1 and Value2, output Sum)
  7. Meandre: Dataflow Example
       •  Dataflow addition example
            –  Logical operation ‘+’
            –  Requires two inputs
            –  Produces one output
          (Diagram: inputs Value1 and Value2, output Sum)
       •  When two inputs are available
            –  The logical operation can be performed
            –  Sum is output
       •  When output is produced
            –  Reset internal values
            –  Wait for two new input values to become available
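The firing rule on this slide can be illustrated with a minimal Python sketch. It is not Meandre code: the class name `AdderComponent`, the `push` method, and the port names `value1` and `value2` are hypothetical, chosen only to mirror the diagram labels. The component stores incoming values, fires once both inputs are present, and then resets.

```python
class AdderComponent:
    """Toy dataflow component (hypothetical, not the Meandre API)."""

    def __init__(self):
        # One slot per input port; None means "no data yet".
        self._inputs = {"value1": None, "value2": None}

    def push(self, port, value):
        """Deliver a value to an input port; return the sum if the component fires."""
        if port not in self._inputs:
            raise KeyError(f"unknown input port: {port}")
        self._inputs[port] = value
        if any(v is None for v in self._inputs.values()):
            return None  # still waiting for the other input
        total = self._inputs["value1"] + self._inputs["value2"]
        # Reset internal values and wait for two new inputs.
        self._inputs = {"value1": None, "value2": None}
        return total


adder = AdderComponent()
first = adder.push("value1", 2)    # only one input available, nothing fires
second = adder.push("value2", 3)   # both inputs available, the component fires
third = adder.push("value2", 10)   # state was reset after firing, waits again
```

Note that nothing calls the addition explicitly: the arrival of the second value is what triggers it, which is the sense in which data dictates execution.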
  8. Meandre: The Dataflow Component
       •  Data dictates component execution semantics
          (Diagram labels: Inputs, Outputs, Component, P, "Descriptor in RDF of its behavior", "The component implementation")
  9. Meandre: Component Metadata
       •  Describes a component
       •  Separates:
            –  Component semantics (black box)
            –  Component implementation
       •  Provides a unified framework:
            –  Basic building blocks or units (components)
            –  Complex tasks (flows)
            –  Standardized metadata
  10. Meandre: Semantic Web Concepts
       •  Relies on the Resource Description Framework (RDF), which uses a simple notation to express graph relations, usually written as XML, and provides a set of conventions and a common means to exchange information
       •  Provides a common framework to share and reuse data across application, enterprise, and community boundaries
       •  Focuses on common formats for integrating and combining data drawn from diverse sources
       •  Pays special attention to the language used for recording how the data relates to real-world objects
       •  Allows navigation to sets of data resources that are semantically connected
  11. Meandre: Metadata Ontologies
       •  Meandre's metadata relies on three ontologies:
            –  The RDF ontology serves as a base for defining Meandre descriptors
            –  The Dublin Core Elements ontology provides basic publishing and descriptive capabilities in the description of Meandre descriptors
            –  The Meandre ontology describes a set of relationships that model valid components, as understood by the Meandre execution engine architecture
  12. Meandre: Components in RDF
       (Slide annotation on the prefix declarations: existing standards)

         @prefix meandre: <http://www.meandre.org/ontology/> .
         @prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
         @prefix dc:      <http://purl.org/dc/elements/1.1/> .
         @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
         @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix :        <#> .

         <http://dita.ncsa.uiuc.edu/meandre/e2k/components/limited-iterations>
             meandre:name "Limited iterations"^^xsd:string ;
             rdf:type meandre:executable_component ;
             dc:creator "Xavier Llora"^^xsd:string ;
             dc:date "2007-11-17T00:32:35"^^xsd:date ;
             dc:description "Allows only a limited number of iterations"^^xsd:string ;
             dc:format "java/class"^^xsd:string ;
             dc:rights "University of Illinois/NCSA Open Source License"^^xsd:string ;
             meandre:execution_context
                 <http://norma.ncsa.uiuc.edu/public-dav/Meandre/demos/E2K/V1/resources/colt.jar> ,
                 <http://norma.ncsa.uiuc.edu/public-dav/Meandre/demos/E2K/V1/resources/gacore.jar> ,
                 <http://dita.ncsa.uiuc.edu/meandre/e2k/components/limited-iterations/implementation/> ,
                 <http://norma.ncsa.uiuc.edu/public-dav/Meandre/demos/E2K/V1/resources/gacore-meandre.jar> ,
                 <http://norma.ncsa.uiuc.edu/public-dav/Meandre/demos/E2K/V1/
  13. Meandre: Component Types
       •  Components are the basic building blocks of any computational task.
       •  There are two kinds of Meandre components:
            –  Executable components
                 •  Perform computational tasks that require no human interaction during runtime
                 •  Are initialized during flow startup and fired in accordance with the policies defined for them
            –  Control components
                 •  Used to pause the dataflow during user interaction cycles
                 •  The WebUI may be an HTML form, an applet, or another user interface
  14. Meandre: Component Assemblies
       •  Defined by connecting outputs from one component to the inputs of another
            –  Cyclical connections are supported
            –  Components may have
                 •  Zero to many inputs
                 •  Zero to many outputs
                 •  Properties that control runtime behavior
       •  Described using RDF
            –  Enables storage, reuse, and sharing, as with components
            –  Allows discovery and dynamic execution
  15. Meandre: Flow (Complex Tasks)
       •  A flow is a collection of connected components
          (Diagram: components Read, Get, Merge, Do, and Show, each marked P, connected into a flow; caption: Dataflow execution)
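As a rough illustration of "a flow is a collection of connected components", the Python sketch below models a flow as a directed graph whose edges connect one component's output to a named input port of another. The classes `Component` and `Flow` and their methods are hypothetical names invented for this sketch; they are not the Meandre engine API, and each toy component has a single output for brevity.

```python
from collections import deque


class Component:
    """Toy component: fires its function once every input port holds a value."""

    def __init__(self, name, inputs, func):
        self.name = name
        self.inputs = list(inputs)
        self.func = func
        self.pending = {}

    def receive(self, port, value):
        self.pending[port] = value
        if len(self.pending) < len(self.inputs):
            return None  # not all inputs are available yet
        args = [self.pending[p] for p in self.inputs]
        self.pending = {}  # reset and wait for fresh inputs
        return self.func(*args)


class Flow:
    """Toy flow: components plus connectors (output -> target input port)."""

    def __init__(self):
        self.components = {}
        self.connectors = {}  # source name -> list of (target name, target port)

    def add(self, component):
        self.components[component.name] = component

    def connect(self, source, target, port):
        self.connectors.setdefault(source, []).append((target, port))

    def run(self, seeds):
        """Drive the flow from initial (component, port, value) deliveries."""
        queue = deque(seeds)
        while queue:
            name, port, value = queue.popleft()
            result = self.components[name].receive(port, value)
            if result is None:  # in this sketch, None means "no output"
                continue
            for target, target_port in self.connectors.get(name, []):
                queue.append((target, target_port, result))


printed = []
flow = Flow()
flow.add(Component("concat", ["string_one", "string_two"], lambda a, b: a + b))
flow.add(Component("print", ["object"], lambda obj: printed.append(obj)))
flow.connect("concat", "print", "object")
flow.run([("concat", "string_one", "Hello "), ("concat", "string_two", "world!")])
```

Because execution is driven by a queue of pending deliveries rather than by a fixed call order, the same loop would also tolerate cyclical connections, although a cycle that always produces output would keep this toy loop running indefinitely.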
  16. Meandre: Create, Publish, & Share
       •  “Components” and “Flows” have RDF descriptors
            –  Easily shared, fostering reuse
            –  Allow machines to read and interpret them
            –  Independent of the implementations
            –  Combine different implementations and platforms
            –  Components: Java, Python, Lisp, Web Services
            –  Execution: on a laptop or a high-performance cluster
       •  A “Location” is an RDF descriptor of one to many components, one to many flows, and their implementations
  17. Meandre: Repository & Locations
       •  Each location represents a set of components and flows
       •  Users can
            –  Combine different locations together
            –  Create components
            –  Assemble flows
            –  Share components and flows
       •  Repositories help
            –  Administer complex environments
            –  Organize components and flows
  18. Meandre: Metadata Properties
       •  Components and flows share properties such as component name, creator, creation date, description, tags, and rights.
       •  Component-specific metadata describes the component's behavior: its location, type of implementation, firing policy, runnable, format, resource location, and execution context.
       •  Flow-specific metadata describes the directed graph of components: component instances, connectors, connector instance data port source, connector instance data port target, connector instance source, connector instance target, and instance name.
  19. And, Switch!
  20. Wrapping With Components
       •  Component provides inputs, outputs, properties
       •  You code
            –  Inside!
            –  Call from!
            –  A WS front end
            –  Interactive application
            –  Request/response cycles
  21. Meandre: Programming Paradigm
       •  The programming paradigm creates complex tasks by linking together specialized components. Meandre's publishing mechanism allows components developed by third parties to be assembled into a new flow.
       •  There are two ways to develop flows:
            –  Meandre’s Workbench visual programming tool
            –  Meandre’s ZigZag scripting language
  22. Meandre: Workbench Existing Flow
       (Screenshot with areas labeled Components, Flows, and Locations)
  23. Meandre: ZigZag Script Language
       •  ZigZag is a simple language for describing data-intensive flows
            –  Modeled on Python for simplicity
            –  ZigZag is a declarative language for expressing the directed graphs that describe flows
       •  Command-line tools allow ZigZag files to be compiled and executed
            –  A compiler transforms a ZigZag program (.zz) into a Meandre archive unit (.mau)
            –  MAUs can then be executed by a Meandre engine
  24. Meandre: ZigZag Script Language
       •  An example flow diagram
            –  The flow pushes two strings that get concatenated and printed to the console
          (Flow diagram)
  25. Meandre: ZigZag Script Language
       •  ZigZag code that represents the example flow:

         #
         # Imports the three required components and creates the component aliases
         #
         import <http://localhost:1714/public/services/demo_repository.rdf>
         alias <http://test.org/component/push_string> as PUSH
         alias <http://test.org/component/concatenate-strings> as CONCAT
         alias <http://test.org/component/print-object> as PRINT
         #
         # Creates four instances for the flow
         #
         push_hello, push_world, concat, print = PUSH(), PUSH(), CONCAT(), PRINT()
         #
         # Sets up the properties of the instances
         #
         push_hello.message, push_world.message = "Hello ", "world!"
         #
         # Describes the data-intensive flow
         #
         @phres, @pwres = push_hello(), push_world()
         @cres = concat( string_one: phres.string; string_two: pwres.string )
         print( object: cres.concatenated_string )
         #
  26. Meandre: ZigZag Script Language
       •  Automatic parallelization
            –  Multiple instances of a component can be run in parallel to boost throughput.
            –  A specialized operator in ZigZag causes multiple instances of a given component to be used.
       •  Consider the simple flow example shown in the diagram
       •  The dataflow declaration would look like:

         #
         # Describes the data-intensive flow
         #
         @pu = push()
         @pt = pass( string:pu.string )
         print( object:pt.string )
  27. Meandre: ZigZag Script Language
       •  Automatic parallelization
            –  Adding the operator [+AUTO] to the middle component:

         # Describes the data-intensive flow
         #
         @pu = push()
         @pt = pass( string:pu.string ) [+AUTO]
         print( object:pt.string )

            –  [+AUTO] tells the ZigZag compiler to parallelize the “pass” component instance by the number of cores available on the system.
            –  [+AUTO] may also be written [+N], where N is a numeric value, for example [+10].
  28. Meandre: ZigZag Script Language
       •  Automatic parallelization
            –  Adding the operator [+4] would result in a directed graph

         With [+4]:

         # Describes the data-intensive flow
         #
         @pu = push()
         @pt = pass( string:pu.string ) [+4]
         print( object:pt.string )

         With [+4!]:

         # Describes the data-intensive flow
         #
         @pu = push()
         @pt = pass( string:pu.string ) [+4!]
         print( object:pt.string )
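The idea behind [+AUTO] and [+N] can be approximated in Python as fanning items out to several workers that all run the same middle component. This is only an analogy under stated assumptions: the helper `run_parallel` and the function `pass_through` are hypothetical names, threads stand in for parallel component instances, and nothing here reflects how the ZigZag compiler actually rewrites the flow graph.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pass_through(text):
    """Stand-in for the middle 'pass' component: forwards its input unchanged."""
    return text


def run_parallel(component, items, instances=None):
    """Apply one component to many items using several parallel workers.

    instances=None plays the role of [+AUTO] (one worker per available core);
    an integer plays the role of [+N].
    """
    workers = instances or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order of the inputs.
        return list(pool.map(component, items))


auto_results = run_parallel(pass_through, ["a", "b", "c"])
four_results = run_parallel(pass_through, ["a", "b", "c"], instances=4)
```

The downstream component (the print step in the slides) is unaffected by the change: it still receives one stream of strings, whatever number of workers produced them.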
  29. Meandre: Flows to MAU
       •  Flows can be executed using their RDF descriptors
       •  Flows can be compiled into a MAU
       •  A MAU is:
            –  A self-contained representation
            –  Ready for execution
            –  Portable
            –  The base of flow execution in grid environments
  30. Meandre: The Architecture
       •  The design of the Meandre architecture follows three directives:
            –  Provide a robust, transparent, and scalable solution from a laptop to large-scale clusters
            –  Create a unified solution for batch and interactive tasks
            –  Encourage reusing and sharing components
       •  To meet these goals, the architecture relies on four stacked layers and builds on top of service-oriented architectures (SOA)
  31. Meandre: Basic Single Server
  32. Meandre MDX: Cloud Computing
       •  Servers can be
            –  Instantiated on demand
            –  Disposed of when done or on demand
       •  A cluster is formed by at least one server
       •  The Meandre Distributed Exchange (MDX)
            –  Orchestrates operational integrity by managing cluster configuration and membership using a shared database resource
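One way to picture cluster membership kept in a shared database is the Python sketch below. Everything in it is an assumption made for illustration: the table name `servers`, its columns, the helper functions, and the use of an in-memory SQLite database are all hypothetical, and the slides do not say which database or schema MDX uses.

```python
import sqlite3


def open_registry():
    """Create a toy membership registry (hypothetical schema, in-memory database)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE servers (host TEXT, port INTEGER, PRIMARY KEY (host, port))"
    )
    return db


def join_cluster(db, host, port):
    """A server instantiated on demand registers itself as a cluster member."""
    db.execute("INSERT OR IGNORE INTO servers VALUES (?, ?)", (host, port))
    db.commit()


def leave_cluster(db, host, port):
    """A server that is disposed of removes its membership record."""
    db.execute("DELETE FROM servers WHERE host = ? AND port = ?", (host, port))
    db.commit()


def members(db):
    """Any server can read the current membership from the shared resource."""
    return db.execute("SELECT host, port FROM servers ORDER BY host, port").fetchall()


registry = open_registry()
join_cluster(registry, "node-a", 1714)
join_cluster(registry, "node-b", 1714)
leave_cluster(registry, "node-a", 1714)
current = members(registry)
```

In this picture a cluster exists as long as the table holds at least one row, which matches the statement that a cluster is formed by at least one server.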
  33. Meandre MDX: The Picture
       (Diagram label: MDX Backbone)
  34. Meandre MDX: The Architecture
       •  Virtualization infrastructure
            –  Provides uniform access to the underlying execution environment. It relies on virtualization of machines and on Java for hardware abstraction.
       •  IO standardization
            –  A unified layer provides access to shared data stores, distributed file systems, specialized metadata stores, and other service-oriented architecture gateways.
  35. Meandre MDX: The Architecture
       •  Data-intensive flow infrastructure
            –  Provides the basic Meandre execution engine for data-intensive flows, component repositories and discovery mechanisms, extensible plugins, and web user interfaces (webUIs).
       •  Interaction layer
            –  Can provide self-contained applications via webUIs, create plugins for third-party services, interact with the embedding application that relies on the Meandre engine, or provide services to the cloud.
  36. SEASR: Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
       Xavier Llora, Bernie Acs, Loretta Auvil, Boris Capitanu, Michael Welge, David Goldberg
       National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
       {xllora, acs1, lauvil, capitanu, mwelge, deg}@illinois.edu