SlideShare a Scribd company logo
1 of 102
Scientific and Grid Workflow
                   Management

                     Cesare Pautasso
                   University of Lugano
                 http://www.pautasso.info


8.5.2009            Scientific and Grid Workflow (Cesare Pautasso)   1
Abstract
 Grid workflow management systems coordinate multiple job
 submissions over heterogeneous Grid resources.
 They feature visual programming environments to give scientist
 a high-level view over distributed computations composed of
 Grid services.
 This brief introduction to the field of scientific and Grid
 workflows includes a survey of selected workflow management
 tools and outlines current research trends.




  Swiss Grid School SGS‘09 @ GPC’09, Geneva, Switzerland
8.5.2009               Scientific and Grid Workflow (Cesare Pautasso)   2
Cesare Pautasso
           Ph.D. at ETH Zürich (2004)
           Post-Doc at ETH Zürich in the Systems (IKS) Group
             • Software: JOpera: Process Support for more than Web services
                                                                  http://www.jopera.org/
           Researcher at IBM Zurich Research Lab (2007)
           Assistant Professor at the new Faculty of Informatics,
           University of Lugano (USI), Switzerland (since September 2007)
             • USI Representative in the SwiNG Assembly (since 2007)
             • Grid Workflow Working Group Lead (since 2007)


           More Information: http://www.pautasso.info/
           Follow me on: http://twitter.com/pautasso/

8.5.2009                            Scientific and Grid Workflow (Cesare Pautasso)         3
Acknowledgements
           Some material contained in this tutorial was adapted from slides
           originally published by:
                                       Gustavo Alonso
                                         Win Bausch
                                        Ewa Deelman
                                          Ian Foster
                                         Yolanda Gil
                                        Carole Goble
                                         Roy Grønmo
                                       Thomas Heinis
                                      Rajesh Kalyanam
                                       Francesco Lelli
                                        Omer F. Rana
                                        Heiko Schuldt
                                       Frank Terpstra

8.5.2009                           Scientific and Grid Workflow (Cesare Pautasso)   4
Outline
 Why Workflow Management on the Grid?
 Discussion: Scientific vs. Grid vs. Business Workflows
     • Some Application Examples
 Workflow Modeling Languages and Tools Overview
     • Grid Workflow Language Patterns
 Running Workflows on the Grid
     • JOpera: Scientific Workflow for Eclipse
     • Workflows and Provenance




8.5.2009                      Scientific and Grid Workflow (Cesare Pautasso)   5
Why Workflow Management on
                    the Grid?




8.5.2009            Scientific and Grid Workflow (Cesare Pautasso)   6
Kinds of Grid Computation
 One Job Submission                           Parameter Sweep




8.5.2009              Scientific and Grid Workflow (Cesare Pautasso)   7
Variablility
                                                                                    and error
  Condition A                                                                      assessment
                                                                                                  Significance
                                                                                                  assessment
                                                                                        Determine
                                                                                      parameters for       likelihood for differential
                                                     Annotate and
                                                                                       error model            expression for each
                                                      normalize
                                                        data                                                         gene
                                                                                        Data
                                  MicroArray                                        Preprocessing
                wet lab            Scanner
              processing
                                                                                                    Clustering

                                                                     Extract raw
                                                                    spot intensities

                                                                                                Determine
                          1 spot = 1 gene                                                       Expression
                          Expression level:                      Image processing                Pattern
  Condition B             Green: A > B
                          Red: A < B
                          Black: A = B
      Cell
8.5.2009     population                        Scientific and Grid Workflow (Cesare Pautasso)                                       8
Variablility

  Condition A
                            in vitro in silico                                      and error
                                                                                   assessment
                                                                                                  Significance
                                                                                                  assessment
                                                                                        Determine
                                                                                      parameters for       likelihood for differential
                                                     Annotate and
                                                                                       error model            expression for each
                                                      normalize
                                                        data                                                         gene
                                                                                        Data
                                  MicroArray                                        Preprocessing
                wet lab            Scanner
              processing
                                                                                                    Clustering

                                                                     Extract raw
                                                                    spot intensities

                                                                                                Determine
                          1 spot = 1 gene                                                       Expression
                          Expression level:                      Image processing                Pattern
  Condition B             Green: A > B
                          Red: A < B
                          Black: A = B
      Cell
8.5.2009     population                        Scientific and Grid Workflow (Cesare Pautasso)                                       9
8.5.2009   Scientific and Grid Workflow (Cesare Pautasso)   10
8.5.2009   Scientific and Grid Workflow (Cesare Pautasso)   11
Vision for Scientific and Grid Workflows
   Make it easy to build Grid applications composed of multiple jobs




                                                                “          Provide the scientist
                                                                           with a platform that
                                                                           takes care of all data
                                                                           handling and record
                                                                           keeping chores so
                                                                           that the user can
                                                                           concentrate on the
                                                                           science and not
                                                                           computer science

8.5.2009                  Scientific and Grid Workflow (Cesare Pautasso)
                                                                                              ”     12
Workflow

            GRID
8.5.2009     Scientific and Grid Workflow (Cesare Pautasso)   13
Some (Scientific) Workflow Management Systems
 Askalon            GWFE                         Pegasus                 Triana
 Bigbross Bossa     GWES                         Pipeline Pilot          Trident
 BioPipe            ICENI                        P-GRADE                 Twister
 BPMN               Inforsense                   PowerFolder             Ultimus
 Breeze             JIGSA                        Ptolemy II              Versata
 Carnot             JOpera                       Savvion                 Viztrails
 Con:cern           Kepler                       Seebeyond               wftk
 DAGMan             Karajan                      SCIRun                  XFlow
 DiscoveryNet       Oakgrove's                   ScyFLOW                 YAWL
 Dralasoft             reactor                   SDSC Matrix             Wildfire
 GEL                OSIRIS                       SHOP2                   WFEE
 GridAnt            OSWorkflow                   Taverna                 WS-BPEL
 Grid Job Handler   OpenWFE                      Teuta (UML)             ZBuilder
8.5.2009                Scientific and Grid Workflow (Cesare Pautasso)               14
Scientific vs. Grid vs. Business
                      Workflows




8.5.2009              Scientific and Grid Workflow (Cesare Pautasso)   15
The Origins: Business Process Management




   who has to do what, when


8.5.2009         Scientific and Grid Workflow (Cesare Pautasso)   16
The Origins: Business Process Management
 •         A business process describes key procedures within an organization.
           They involve:
            • multiple steps
            • numerous people
            • large amounts of resources
 •         In large business organizations there are many factors that increase
           the complexity of the business processes:
            •   processes are not well documented
            •   conformance to rules not guaranteed
            •   people lack information about context
            •   company lacks monitoring tools
            •   steps, people and resources are not properly coordinated
 •         Workflow Management Systems try to address these problems by
           automating the coordination aspects of a business process:
           who has to do what, when, and with which software tools.
8.5.2009                              Scientific and Grid Workflow (Cesare Pautasso)   17
Business Workflows



           “  The automation of a
           business process where
           documents, information to be
           processed or tasks to be carried
           out are passed from one
           participant to another following a
           set of procedural rules

               Worfklow Management Coalition (WfMC, 1993)
                                                                      ”
8.5.2009             Scientific and Grid Workflow (Cesare Pautasso)       18
Scientific Workflows

               are networks of analytical
           “
           steps that may involve, e.g.,
           database access and querying,
           data analysis and mining, and
           many other steps including
           computationally intensive jobs
           submitted to high performance
           clusters and Grids
                                                       ”        Bertram Ludäscher
8.5.2009           Scientific and Grid Workflow (Cesare Pautasso)                   19
Modeling Workflows                                                      activity
                                                                                4                             0
                           activity
                                  1
                                                                                                  2               1

     User
                       Software                       User                             Software
                       Tool                                                            Tool

                              Control Flow                                                            3           4
                                                                       Activity
            Activity




                                                                                                      5
                                  Data Flow
                                                                       Data
            Data
                                                                                                          6

8.5.2009                              Scientific and Grid Workflow (Cesare Pautasso)                                  20
Business Workflows                                                  activity
                                                                            4                             0
                       activity
                            1
                                                                                              2               1

     User
                                                  User                             Software
                                                                                   Tool

                          Control Flow                                                            3           4
                                                                   Activity
            Activity




                                                                                                  5


                                                                                                      6

8.5.2009                          Scientific and Grid Workflow (Cesare Pautasso)                                  21
Scientific Workflows                                                   activity
                                                                               4                             0
                          activity
                                 1
                                                                                                 2               1


                      Software                                                        Software
                      Tool                                                            Tool

                             Control Flow                                                            3           4
                                                                      Activity
           Activity




                                                                                                     5
                                 Data Flow
                                                                      Data
           Data
                                                                                                         6

8.5.2009                             Scientific and Grid Workflow (Cesare Pautasso)                                  22
Similarities:
Are scientists doing e-Business?

           Capturing knowledge/best practices
Capture business processes within a company
                                        Capture scientific experiments

      Executable Models for Repeated Execution
Run a well defined procedure many times
                           Ensure that an experiment can be reproduced

      Incorporate human decision in the process
Can we always do straight-through processing?
                                        Hard to achieve full automation


                                                                          23
Differences:
Do scientists need business transactions?

                         Rate of change
Changing business procedures requires management approval
                  Exploratory scientific processes require high flexibility

                      Which kind of data?
Travel reservations, Loan applications
           Large protein sequence databases, Astronomy image catalogs

                  What is the ultimate goal?
Making profit
                                                           Making science


                                                                              24
Scientific vs. Grid Workflows
Scientific workflows                 Grid workflows focus on the
emphasize the design of              large-scale execution of
virtual experiments:                 scientific workflows:
 • Data flow models                   • Mapping and adaptation to a
 • Reusable “scientific computing”      dynamic run-time environment
   component library                  • Provide access to shared
 • Interactive debugging,               workflows as a Grid service
   monitoring and steering            • Parameterized Execution
 • Data provenance and lineage        • Centralized vs. Distributed
   tracking for reproducibility         Execution Architectures
 • Model versioning for               • Fault Tolerance
   exploratory customization          • Optimization

                                                                       25
Scientific Workflows on the Grid
• How can Scientific WF benefit from the Grid?
   1. Leverage underlying Grid middleware:
     •   Resource Management
     •   Job Scheduling
     •   Large Data Transfers (GridFTP) between Activities
   2. Improved QoS based on the workflow model
     •   Grid resource reservation
     •   Data replication
     •   Data placement
     •   Fault Tolerance



                                                             26
http://www.jopera.org/




                          Example




8.5.2009                 Scientific and Grid Workflow (Cesare Pautasso)   27
A Web Service-Enabled Workflow
System for Climate Modeling Data
     Processing in TeraGrid

            Rajesh Kalyanam
                Lan Zhao
              Taezoon Park
           Sebastien Goasguen




                                   28
Architecture
  Application                        Desktop
                                                       Web Poral
                                    Application


  Workflow Engine
                                                  JOpera


  Workflow Components
   Data Components                                               Computation Components

     Metadata         Data        Data          Data               Globus                   Models /
                                                                              Condor Job
      Query        Discovery    Transfer   Transformation           Job                      Tools



  Purdue Data Management System              Remote Data Proxies       Computation Middleware




                                                                                                       From Rajesh Kalyanam
   OPeNDAP      THREDDS        SRB/MCAT             Data Proxy              Condor-G       Globus



  Local Datasets                              Remote Datasets          Computation Resources
                               Climate            Remote
   LARS      PTO      NWS                                               Local Clusters      TeraGrid
                               Modeling           Datasets



                                                                                                          29
Portal




     From Rajesh Kalyanam
30
Workflow Model




     From Rajesh Kalyanam
31
Workflow Execution




                     From Rajesh Kalyanam
                        32
Workflow Results




     From Rajesh Kalyanam
33
Workflow Lifecycle




8.5.2009       Scientific and Grid Workflow (Cesare Pautasso)   34
Workflow Lifecycle
                     Modeling                     Workflow Execution


            Scientific
                                         Workflow                            Workflow
           Computation
                                          Model                              Instances




                                                                                         From Gustavo Alonso
                    Simulation                                       Log Analysis


8.5.2009                   Scientific and Grid Workflow (Cesare Pautasso)                   35
Workflow Modeling Methodologies




8.5.2009           Scientific and Grid Workflow (Cesare Pautasso)   36
Bottom up Composition
  4. Share and Publish it as Web Service

  3. Run, Test, and Debug the execution
     within the same modeling environment

  2. Build a workflow using a drag, drop
     and connect modeling environment

    1. Select components from a library
           a. Lookup services in a public registry
           b. Import from external Web service (WSDL)
           c. Search the standard library

8.5.2009                       Scientific and Grid Workflow (Cesare Pautasso)   37
Top down Decomposition
       1. Define a goal and Draw a skeleton of the
          workflow that satisfies it
       2. Refine it and Bind services into it:
           •   Search for existing matching services
           •   Build missing services (if necessary)
           •   Add required data transformations
       3. Run, Test, and Debug the execution
          within the same modeling environment
       4. Share and Publish it as Web Service



8.5.2009                       Scientific and Grid Workflow (Cesare Pautasso)   38
Iterative Composition
           Change, Rediscover
           Build New services rvices                                              Refactor
                           er se
                       cov
                   Dis

                                                                                  Model
           Manage                                                                 Service
                                                                                  Composition

                    Deploy
                    Run, Test
                                                                                 Check
                                    Compile
8.5.2009                        Scientific and Grid Workflow (Cesare Pautasso)                  39
Workflow Modeling Languages
               and Tools Overview




8.5.2009            Scientific and Grid Workflow (Cesare Pautasso)   40
HeNCE - The Ancestor of Grid Workflows?




      A. Beguelin, J. J. Dongarra, G. A. Geist, R. Manchek, V. S. Sunderam, Graphical Development Tools for Network-Based Concurrent
      Supercomputing, in: Proc. of the 1991 ACM/IEEE conference on Supercomputing, Albuquerque, New Mexico, 1991, pp. 435–444.


8.5.2009                                              Scientific and Grid Workflow (Cesare Pautasso)                                   41
request : Inbox



                 <<transformation>>                                 <<transformation>>
                 symbol = Symbol;                                   country1 := 'USA';
                                                                    country2 = Currency;

symbol : getQuoteRequest1
                                                                                                        rateRequest : getRateRequest




                           <<WebServiceCall>>                                                                         <<WebServiceCall>>
                                getQuote                                                                                    getRate
             {service=net.xmethods.services.stockquote.StockQuoteService,                       {wsdl=http://www.xmethods.net/sd/2001/CurrencyExchangeService.wsdl,
                                                      operation=getQuote,                                                          service=CurrencyExchangeService,
wsdl=http://services.xmethods.net/soap/urn:xmethods-delayed-quotes.wsdl,                                                         portType=CurrencyExchangePortType,
          portType=net.xmethods.services.stockquote.StockQuotePortType}                                                                           operation=getRate}




    originalPrice : getQuoteResponse1                                  conversionRate : getRateResponse



             <<transformation>>                                                <<transformation>>
             a = Result;                                                       b = Result;
                                                                                                              Extended UML
    twoNumbers : MultiplyInbox                                                                                Activity Diagrams

                                        <<ImmediateStep>>




                                                                                                                                                                  From Roy Gronmo
                                                                     product : MultiplyOutbox
                                             Multiply



       <<transformation>>                                     <<transformation>>
      OriginalPrice = Result;                                 ConvertedPrice = c;



                                   response : Outbox
ExampleStockQuoteConvert Input                          JOpera
                                                                      Data Flow
    usa
                    country                         symbol
                                                                       Graph


country1         country2                               symbol


      ?
       getRate                                          getQuote


           Result       b                       a            Result


                              DataIntegration


                                      c
                                                     quote



                                    ExampleStockQuoteConvert Output
Triana




8.5.2009   Scientific and Grid Workflow (Cesare Pautasso)      44
Taverna




8.5.2009   Scientific and Grid Workflow (Cesare Pautasso)             45
VizTrails




8.5.2009   Scientific and Grid Workflow (Cesare Pautasso)               46
Trident
OSIRIS




8.5.2009            Scientific and Grid Workflow (Cesare Pautasso)   48
JOpera

8.5.2009   Scientific and Grid Workflow (Cesare Pautasso)   49
Grid Workflow Language
                  Patterns




8.5.2009         Scientific and Grid Workflow (Cesare Pautasso)   50
Workflow Pattern                        Variants

                                                        Implicit
           1. Simple Parallelism
                                                        Explicit

                                                  Static
           2. Data Parallelism
                                                  Dynamic

                                    Best Effort
                                    Blocking
                                                                     Hybrid
           3. Pipelining            Buffered
                                                                            Synchronized
                                    Superscalar
                                                                            Out of Order
                                    Streaming
8.5.2009                   Scientific and Grid Workflow (Cesare Pautasso)                  51
Modeling Simple Parallelism
Data Flow, Graph Based




                                                                      SCIRun
                                                                      Kepler
                                                                      Triana
8.5.2009             Scientific and Grid Workflow (Cesare Pautasso)            52
Modeling Simple Parallelism
Control Flow, Graph Based




   JOpera
    GEL                                                               UML

8.5.2009             Scientific and Grid Workflow (Cesare Pautasso)         53
Modeling Simple Parallelism
Control Flow, Block Based




     BPMN                                                              WS-BPEL
8.5.2009              Scientific and Grid Workflow (Cesare Pautasso)             54
Modeling Data Parallelism
Data Flow, Graph Rewriting




           Static or Dynamic

       Triana          Taverna
                       JOpera

8.5.2009                       Scientific and Grid Workflow (Cesare Pautasso)   55
Modeling Data Parallelism
Control Flow, Block Based, Dynamic




                                                                       WS-BPEL
                                                                        AGWL
                                                                        Karajan
                                                                         GEL

8.5.2009              Scientific and Grid Workflow (Cesare Pautasso)              56
Modeling Pipelined Execution
                                                                1
                                                                            2
           1, 2, 3, …
                                                                                3




8.5.2009                   Scientific and Grid Workflow (Cesare Pautasso)           57
Pipelining Semantics




8.5.2009       Scientific and Grid Workflow (Cesare Pautasso)   58
Best Effort Pipelined Execution



           Drop data elements on pipeline collisions
           Advantages:
              •   Simplified implementation
              •   Some applications may tolerate data loss
           Problem:
              •   Downsampling is non deterministic



8.5.2009                       Scientific and Grid Workflow (Cesare Pautasso)   59
Blocking Pipelined Execution




           Tasks are blocked if successors are busy
           Advantages:
              •   Avoid data loss in the pipeline
           Problem:
              •   Pipeline speed limited by slowest task
              •   Data may be lost before it enters the pipeline



8.5.2009                      Scientific and Grid Workflow (Cesare Pautasso)   60
Buffered Pipelined Execution




           Tasks are decoupled by buffers
           Advantages:
              •    Collisions are prevented
              •    Best applied to tasks having variable speed
           Problem:
              •    Buffer capacity is limited
                   (Blocking still needed – Hybrid semantics)


8.5.2009                      Scientific and Grid Workflow (Cesare Pautasso)   61
Streaming Pipelined Execution



           Tasks exchange data while running
           Advantages:
              •   Suitable for a distributed (P2P) engine
           Problems:
              •   Shifts complexity from the workflow engine to the tasks
              •   Tasks exchange data while running
              •   Workflow/Task interface more complex


8.5.2009                      Scientific and Grid Workflow (Cesare Pautasso)   62
Running Workflows on the Grid




8.5.2009             Scientific and Grid Workflow (Cesare Pautasso)   63
Workflow Model                                                     Basic Architecture
                   Act 2     Act 4

           Act 1             Act 5               Act 7

                   Act 3     Act 6

                                                     Workflow Management System



                            Adapters

                                Grid
                           Schedulers
                                                                                      Grid
                                                                                      Resources

8.5.2009                             Scientific and Grid Workflow (Cesare Pautasso)                        64
Workflow          Workflow
                     Workflow Model                                            Users             Participants
                     Act 2      Act 4

           Act 1                Act 5               Act 7

                     Act 3      Act 6

                                                        Workflow Management System

                   Workflow
                   Modelers
                               Adapters

                                   Grid
                              Schedulers                                                              Wrapper
   Scientific                                                                                         Developers
                                                                                         Grid
   Software
                                                                                         Resources
   Developers
8.5.2009                                Scientific and Grid Workflow (Cesare Pautasso)                             65
Standard APIs                                 Process
                                             Definition Tools


                                   Interface 1

                            Workflow API and Interchange formats                           Interface 4
              Interface 5
                                   Workflow Enactment Service                                    Other Workflow
    Administration                                                                                Enactment Service(s)
     & Monitoring
       Tools                              Workflow                                                   Workflow
                                          Engine(s)                                                  Engine(s)




                                                                                                                     Workflow Reference Model, 1998
                     Interface 2                                                  Interface 3

                              Workflow
                                                                   Invoked




                                                                                                                     From WFMC,
                               Client
                                                                  Applications
                             Applications

8.5.2009                               Scientific and Grid Workflow (Cesare Pautasso)                                                 66
Wrappers and Grid Applications
                                                                                                  Worklist
            Act X        Workflow Management System                                               Handler
                                                            adapter


                                                                                                         Application
                                 wrapper                                 Grid                                 X
                                                                       Scheduler
           Application
                X


                                Application                                         Application
                                     X                                                   X

8.5.2009                           Scientific and Grid Workflow (Cesare Pautasso)                                      67
Wrappers and Legacy Applications
  •        The workflow engine is also in charge of connecting the different
           scientific applications.
  •        These applications do not have to talk directly to each other, they do it
           through the workflow engine.
  •        Most engines target a service oriented applications for which they
           provide very good connectivity through standardized protocols.
           Otherwise, the interface adapters must be developed on a case by case
           basis (as a last resort manual integration may be required!)
  •        For legacy application, a wrapper must be built so that the workflow
           engine can communicate with the application. The wrapper can be a
           simple relay of commands and data, or a complete translation program
           implementing functionality not present in the legacy application.
  •        For most Grid applications, the interaction takes place through a Grid
           scheduler, which is responsible for managing the distributed execution of
           the applications.

8.5.2009                           Scientific and Grid Workflow (Cesare Pautasso)      68
Run-time Abstraction Levels




8.5.2009          Scientific and Grid Workflow (Cesare Pautasso)   69
Run-time Abstraction Levels
  •        A design-time workflow model needs to be mapped across different
           abstraction levels in order to be executed at run time.
  •        User request the execution of a new workflow instance.
  •        The abstract workflow is mapped to an executable instance by:
            • Finding suitable service implementations and binding them to the tasks
            • Rewriting the workflow graph based on a set of refinement rules
            • Planning required data staging, registration, placement, replication and
              transfer operations
  •        Each task of the resulting executable workflow is then submitted to a
           Grid resource manager so that it can be scheduled on suitable resources
  •        The mapping can be done:
            • when the workflow is started at instantiation time (statically)
            • incrementally as the workflow runs (adaptive execution with dynamic late
              binding)

8.5.2009                             Scientific and Grid Workflow (Cesare Pautasso)      70
Example: Binding with WS-BPEL


                            tion ti   me                               set of services
                 co mp o s i                                       (BPEL partner link type)

                                                                        one service
                  deployment time
                                                                      (WSDL port type)
      Activity
                                                                         service
                                                                        end point
                                                                          (port)
                       invocation time




8.5.2009                              Scientific and Grid Workflow (Cesare Pautasso)          71
Workflow Binding Lifecycle

           Library Registration time (classification)
           Modeling time (static early binding)
           Compilation time (blacklisting)
           Deployment time (customization)
           Startup time (testing)
           Task Execution time (dynamic late binding)
           Failed invocation time (rebind on retry)

8.5.2009                 Scientific and Grid Workflow (Cesare Pautasso)   72
http://www.jopera.org/




                       JOpera
           Scientific Workflow for Eclipse




8.5.2009                 Scientific and Grid Workflow (Cesare Pautasso)   73
            High Level Workflow Language
                 Data and Control Aspects (Visual Representation)
                 Recursion, Iteration, Parallelism and Pipelining

            Open and Extensible Component Model
                 Run existing code without changes
                 Synchronous, Asynchronous, and Streaming interaction
                 Web services support (Axis, WSIF)
                 Secure access to remote file systems and hosts (SSH)
                 Easy to integrate with existing schedulers (e.g. Condor)




6 February 2009                   ©2009 Cesare Pautasso | www.pautasso.info   74
            High Level Workflow Language
                 Data and Control Aspects (Visual Representation)
                 Recursion, Iteration, Parallelism and Pipelining

            Open and Extensible Component Model
                 Run existing code without changes
                 Synchronous, Asynchronous, and Streaming interaction
                 Web services support (Axis, WSIF)
                 Secure access to remote file systems and hosts (SSH)
                 Easy to integrate with existing schedulers (e.g. Condor)

            Strong Eclipse Foundation
                 Platform Independent (Eclipse/Java)
                 Flexible, Extensible, Modular and Embeddable

6 February 2009                   ©2009 Cesare Pautasso | www.pautasso.info   75
JOpera Visual Composition Language
           Workflows are modeled using multiple viewpoints:

 1. Data Flow Graph                                  2. Control Flow Graph                          3. Service Bindings

   isbn             QueryBookPrice                                  QueryBookPrice
                                                                                                       QueryBookPrice
                    e
                    c
                    r
                    P
                    -
                    s
                    b
                    o
                    J




                e
                c
                r
                P
                -
                s
                b
                o
                J




                        e
                        c
                        r
                        P
                        -
                        s
                        b
                        o
                        J




                                                                                                        Amazon.com
                                price
                                               t
                                               s
                                               u
                                               q
                                               e
                                               R

                                           o
                                           u
                                           i
                                           t
                                           a
                                           n
                                           v
                                           j
                                           e
                                           R




                                                                  CurrencyConvert
                                                                                                       CurrencyConvert
                                amount
                                                                                                           XE.com
                            z
                            i
                            o
                            l
                            n
                            a
                            t
                            U




            z
            i
            o
            l
            n
            a
            t
            U




                    CurrencyConvert      amount
                                                                           Exception Handler




8.5.2009                                           Scientific and Grid Workflow (Cesare Pautasso)                        76
JOpera Example: Doodle Map Mashup
           Setup a Doodle with Yahoo! Local search and visualize
             the results of the poll on Google Maps




8.5.2009                      Scientific and Grid Workflow (Cesare Pautasso)   77
Doodle Map Mashup Architecture
       Web Browser                          Workflow                                        RESTful
                                             Engine                                       Web Services
                     RESTful API                                                             APIs
                                                                                    GET

                                                                               POST
                                                                                GET




8.5.2009                           Scientific and Grid Workflow (Cesare Pautasso)                        78
8.5.2009   Scientific and Grid Workflow (Cesare Pautasso)   79
Extensible JOpera Component Model
Combine in the same workflow jobs implemented using an open
   and extensible set of technologies



                  JOpera Workflow

   WSDL Java Human XML SQL SSH Condor

            •Snippets                •XSLT
            •Methods                 •XPath
8.5.2009             Scientific and Grid Workflow (Cesare Pautasso)   80
Sharing Workflows as a Service
JOpera processes are automatically published to clients using a
   variety of access protocols

                                                                         Eclipse RCP
       Web Clients        WS Clients
                                                                         Clients

              REST                WSDL                                   Java

                     JOpera Workflow

  WSDL Java Human XML SQL SSH Condor
 8.5.2009               Scientific and Grid Workflow (Cesare Pautasso)                 81
JOpera ARC Integration Demo




8.5.2009                         Scientific and Grid Workflow (Cesare Pautasso)   82
Workflows and Provenance




8.5.2009           Scientific and Grid Workflow (Cesare Pautasso)   83
Lineage in Scientific Workflows

           Scientists consider the “capture and
             generation of provenance
             information as a critical part of the
             workflow-generated data”
           “Sharing workflows is an essential
             element of education, and
             acceleration of knowledge
             dissemination.”
                                                                        Ewa Deelman et al.

8.5.2009               Scientific and Grid Workflow (Cesare Pautasso)                        84
Where does this picture come from?
                                                           METADATA
                                                          This photo was taken
                                                          July 21, 1981,
                                                          when the Voyager 2
                                                          spacecraft
                                                          was 33.9 million km
                                                          from the Saturn planet




8.5.2009          Scientific and Grid Workflow (Cesare Pautasso)                   85
Is this the right metadata?                               METADATA
                                                           Date: 16.4.2005
                                                           Dimension: 640x480
                                                           Colors: 32bits
                                                           Size: 1.2MB
                                                           Format: JPEG
                                                           Camera Model: D100
                                                           Equipment Make: Nikon
                                                           Flash Used: No
                                                           Focal Length: 8.2 mm
                                                                     FOR
                                                           F-Number: F/2.6

    Title: White Arabian Horse                                       SALE
                                                           Exposure: 1/100 s
                                                           Metering: pattern
                                                           GPS: 50.2 Lat. 60.4 Lon.
8.5.2009            Scientific and Grid Workflow (Cesare Pautasso)                    86
Would you buy a horse without this?




8.5.2009          Scientific and Grid Workflow (Cesare Pautasso)   87
Lineage in Spreadsheets




8.5.2009         Scientific and Grid Workflow (Cesare Pautasso)   88
Lineage in Spreadsheets




8.5.2009         Scientific and Grid Workflow (Cesare Pautasso)   89
Lineage in Spreadsheets




8.5.2009         Scientific and Grid Workflow (Cesare Pautasso)   90
Lineage in Spreadsheets




8.5.2009         Scientific and Grid Workflow (Cesare Pautasso)   91
Lineage in Databases
 What is the relationship between these tuples?




                                      SQL


                                                                        Problem:
                                                                        Query
                                                                        Inversion
8.5.2009               Scientific and Grid Workflow (Cesare Pautasso)               92
Lineage in Software Development
 What’s in a Makefile?
CC = gcc
CFLAGS = -Wall -g

program: main.o input.o output.o logic.o
       $(CC) $(CFLAGS) main.o input.o output.o logic.o -o program

main.o: main.c input.h output.h logic.h
       $(CC) $(CFLAGS) -c main.c
input.o: input.c input.h
       $(CC) $(CFLAGS) -c input.c
output.o: output.c output.h
       $(CC) $(CFLAGS) -c output.c
logic.o: logic.c logic.h
       $(CC) $(CFLAGS) -c logic.c

8.5.2009                 Scientific and Grid Workflow (Cesare Pautasso)   93
Lineage in Software Development
 Where does my program come from?
CC = gcc
CFLAGS = -Wall -g

program: main.o input.o output.o logic.o
       $(CC) $(CFLAGS) main.o input.o output.o logic.o -o program

main.o: main.c input.h output.h logic.h
       $(CC) $(CFLAGS) -c main.c
input.o: input.c input.h
       $(CC) $(CFLAGS) -c input.c
output.o: output.c output.h
       $(CC) $(CFLAGS) -c output.c
logic.o: logic.c logic.h
       $(CC) $(CFLAGS) -c logic.c

8.5.2009               Scientific and Grid Workflow (Cesare Pautasso)   94
Lineage in Scientific Workflows


      Input    Theory                                                    An ideal scientific
      Data                                                               workflow should
                                                                         document all of
                    Output      Published                                the steps linking
                     Data       Paper                                    the original
                                                                         observations with
                Scientific Workflow                                      the final published
                                                                         results so that the
       Input   Observation                                               process can be
       Data                                                              reproduced

8.5.2009                Scientific and Grid Workflow (Cesare Pautasso)                         95
Data Provenance




                                                                        Where does this output document come from?
                                                                    ?



8.5.2009           Scientific and Grid Workflow (Cesare Pautasso)                                                    96
Change Propagation




                                                                 What to recompute if this input changes?
8.5.2009
           !    Scientific and Grid Workflow (Cesare Pautasso)                                              97
Conclusion




8.5.2009    Scientific and Grid Workflow (Cesare Pautasso)   98
Workflow
                                               and                                       Modeling
           Reuse                           Component
                                            Libraries                                            Data,
             Data
           Products                                                                             Metadata
                                                                                                Catalogs
                                                                           Populate
                         Adapt,                   Workflow
                                                                           with data
                         Modify                   Template

                                                                              Workflow
                      Data, Metadata,
                                                                              Instance
                       Provenance
                       Information

                                                   Executable                   Map to
                          Execute                                              available             Resource ,
Execution                                           Workflow
                                                                              resources             Application




                                                                                                                   From Ewa Deelman
                                                                                                    Component
             Compute,                                                                               Descriptions
              Storage
  Distributed and
              Network
             Resources                                                                    Mapping
8.5.2009                                Scientific and Grid Workflow (Cesare Pautasso)                               99
Executed
     e-Science as Workflow?
                                                                          Executing
                                                                          Executable
                    Provenance Query
                                                                          Not yet executable


           What I                                       What I
            Did                                        Am Doing                       Model

                                                  What I                       …
                                                 Want to Do




                                                                                              From Ian Foster
           Execution environment                        Schedule
8.5.2009                 Scientific and Grid Workflow (Cesare Pautasso)                       100
Some References    Gil, Y. et al.; Examining the Challenges of
                    Scientific Workflows. IEEE Computer, Dec 2007
                    Taylor, I.J.; Deelman, E.; Gannon, D.B.; Shields,
                    M. (Eds.) Workflows for e-Science: Scientific
                    Workflows for Grids, Springer 2007
                    Yu, J.; Buyya, R.: A taxonomy of workflow
                    management systems for grid computing,
                    Journal of Grid Computing, 3(3–4):171–200
                    (2005)
                    Pautasso, C.; Alonso, G.: Parallel Computing
                    Patterns for Grid Workflows, Proc. Of
                    WORKS@HPDC06, Paris, France, 2006
                    OGF Workflow Research Group
                    http://www.isi.edu/~deelman/wf–rg/
                    Download This Tutorial Material
                    http://www.pautasso.info/lectures/sgs09workflow.pdf

8.5.2009           Scientific and Grid Workflow (Cesare Pautasso)         101
Free Download




http://www.jopera.org/




 6 February 2009   ©2009 Cesare Pautasso | www.pautasso.info   102

More Related Content

What's hot

Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...
Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...
Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...Vall d'Hebron Institute of Research (VHIR)
 
Decision making in management for large medical equipment
Decision making in management for large medical equipmentDecision making in management for large medical equipment
Decision making in management for large medical equipmentHTAi Bilbao 2012
 
Decision making in management for large medical equipment
Decision making in management for large medical equipmentDecision making in management for large medical equipment
Decision making in management for large medical equipmentHTAi Bilbao 2012
 
Ph d colloquium 2
Ph d colloquium 2Ph d colloquium 2
Ph d colloquium 2Rishi Roy
 
Stephen Friend Institute of Development, Aging and Cancer 2011-11-28
Stephen Friend Institute of Development, Aging and Cancer 2011-11-28Stephen Friend Institute of Development, Aging and Cancer 2011-11-28
Stephen Friend Institute of Development, Aging and Cancer 2011-11-28Sage Base
 

What's hot (7)

Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...
Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...
Translational bioinformatics at VHIR: Understanding molecular damage in Fabry...
 
Decision making in management for large medical equipment
Decision making in management for large medical equipmentDecision making in management for large medical equipment
Decision making in management for large medical equipment
 
Decision making in management for large medical equipment
Decision making in management for large medical equipmentDecision making in management for large medical equipment
Decision making in management for large medical equipment
 
Ph d colloquium 2
Ph d colloquium 2Ph d colloquium 2
Ph d colloquium 2
 
Stephen Friend Institute of Development, Aging and Cancer 2011-11-28
Stephen Friend Institute of Development, Aging and Cancer 2011-11-28Stephen Friend Institute of Development, Aging and Cancer 2011-11-28
Stephen Friend Institute of Development, Aging and Cancer 2011-11-28
 
Auxetics and folding
Auxetics and foldingAuxetics and folding
Auxetics and folding
 
Trapped by MTBF
Trapped by MTBFTrapped by MTBF
Trapped by MTBF
 

Viewers also liked

Comparison writing
Comparison writingComparison writing
Comparison writingheejjung
 
De gran voldria ser
De gran voldria serDe gran voldria ser
De gran voldria serniky6789
 
Ebert essay
Ebert essayEbert essay
Ebert essayheejjung
 
Script.docx
Script.docxScript.docx
Script.docxheejjung
 
Управление бизнесом на основе данных
Управление бизнесом на основе данныхУправление бизнесом на основе данных
Управление бизнесом на основе данныхKonstantin Savenkov
 
Conflict case study
Conflict case studyConflict case study
Conflict case studyheejjung
 

Viewers also liked (8)

Comparison writing
Comparison writingComparison writing
Comparison writing
 
De gran voldria ser
De gran voldria serDe gran voldria ser
De gran voldria ser
 
c.harris work
c.harris workc.harris work
c.harris work
 
Interview tips
Interview tipsInterview tips
Interview tips
 
Ebert essay
Ebert essayEbert essay
Ebert essay
 
Script.docx
Script.docxScript.docx
Script.docx
 
Управление бизнесом на основе данных
Управление бизнесом на основе данныхУправление бизнесом на основе данных
Управление бизнесом на основе данных
 
Conflict case study
Conflict case studyConflict case study
Conflict case study
 

Similar to Scientific and Grid Workflow Management (SGS09)

Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...
Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...
Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...Eric Stephan
 
Sc08 Talk Final
Sc08 Talk FinalSc08 Talk Final
Sc08 Talk Finalmrmeswani
 
Testing Tasks and Blooms Taxonomy
Testing Tasks and Blooms TaxonomyTesting Tasks and Blooms Taxonomy
Testing Tasks and Blooms TaxonomyScott Barber
 
Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009bosc
 
my poster presentation in the jcms2011 conference
my poster presentation in the jcms2011 conferencemy poster presentation in the jcms2011 conference
my poster presentation in the jcms2011 conferencePawitra Masa-ah
 
Subspace Identification
Subspace IdentificationSubspace Identification
Subspace Identificationaileencv
 
Bayesian network based software reliability prediction
Bayesian network based software reliability predictionBayesian network based software reliability prediction
Bayesian network based software reliability predictionJULIO GONZALEZ SANZ
 
Debs Presentation 2009 July62009
Debs Presentation 2009 July62009Debs Presentation 2009 July62009
Debs Presentation 2009 July62009Opher Etzion
 
U.S. Nuclear Facilities - Annie Kammerer
U.S. Nuclear Facilities - Annie KammererU.S. Nuclear Facilities - Annie Kammerer
U.S. Nuclear Facilities - Annie KammererEERI
 
Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...researchinventy
 
Sahara icsm 2011
Sahara icsm 2011Sahara icsm 2011
Sahara icsm 2011rnbach
 
High-Throughput Screening of mAb Charge Variants Using Microchip-CZE
High-Throughput Screening of mAb Charge Variants Using Microchip-CZEHigh-Throughput Screening of mAb Charge Variants Using Microchip-CZE
High-Throughput Screening of mAb Charge Variants Using Microchip-CZEPerkinElmer, Inc.
 
Is the current model of load testing broken ukcmg - steve thair
Is the current model of load testing broken   ukcmg - steve thairIs the current model of load testing broken   ukcmg - steve thair
Is the current model of load testing broken ukcmg - steve thairStephen Thair
 
Tim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasetsTim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasetsTERN Australia
 
Smart debugger
Smart debuggerSmart debugger
Smart debuggerTao He
 

Similar to Scientific and Grid Workflow Management (SGS09) (20)

ICSM08a.ppt
ICSM08a.pptICSM08a.ppt
ICSM08a.ppt
 
Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...
Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...
Leveraging The Open Provenance Model as a Multi-Tier Model for Global Climate...
 
Sc08 Talk Final
Sc08 Talk FinalSc08 Talk Final
Sc08 Talk Final
 
Testing Tasks and Blooms Taxonomy
Testing Tasks and Blooms TaxonomyTesting Tasks and Blooms Taxonomy
Testing Tasks and Blooms Taxonomy
 
Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009Welsh_BioHDF_BOSC2009
Welsh_BioHDF_BOSC2009
 
my poster presentation in the jcms2011 conference
my poster presentation in the jcms2011 conferencemy poster presentation in the jcms2011 conference
my poster presentation in the jcms2011 conference
 
Subspace Identification
Subspace IdentificationSubspace Identification
Subspace Identification
 
[Download] rev chapter-8-june26th
[Download] rev chapter-8-june26th[Download] rev chapter-8-june26th
[Download] rev chapter-8-june26th
 
Bayesian network based software reliability prediction
Bayesian network based software reliability predictionBayesian network based software reliability prediction
Bayesian network based software reliability prediction
 
Debs Presentation 2009 July62009
Debs Presentation 2009 July62009Debs Presentation 2009 July62009
Debs Presentation 2009 July62009
 
U.S. Nuclear Facilities - Annie Kammerer
U.S. Nuclear Facilities - Annie KammererU.S. Nuclear Facilities - Annie Kammerer
U.S. Nuclear Facilities - Annie Kammerer
 
Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...
 
Data-Intensive Research
Data-Intensive ResearchData-Intensive Research
Data-Intensive Research
 
Sahara icsm 2011
Sahara icsm 2011Sahara icsm 2011
Sahara icsm 2011
 
DAC 2012
DAC 2012DAC 2012
DAC 2012
 
High-Throughput Screening of mAb Charge Variants Using Microchip-CZE
High-Throughput Screening of mAb Charge Variants Using Microchip-CZEHigh-Throughput Screening of mAb Charge Variants Using Microchip-CZE
High-Throughput Screening of mAb Charge Variants Using Microchip-CZE
 
Is the current model of load testing broken ukcmg - steve thair
Is the current model of load testing broken   ukcmg - steve thairIs the current model of load testing broken   ukcmg - steve thair
Is the current model of load testing broken ukcmg - steve thair
 
Tim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasetsTim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasets
 
Smart debugger
Smart debuggerSmart debugger
Smart debugger
 
March 2013 Introduction
March 2013 IntroductionMarch 2013 Introduction
March 2013 Introduction
 

More from Cesare Pautasso

Beautiful APIs - SOSE2021 Keynote
Beautiful APIs - SOSE2021 KeynoteBeautiful APIs - SOSE2021 Keynote
Beautiful APIs - SOSE2021 KeynoteCesare Pautasso
 
How do you back up and consistently recover your microservice architecture?
How do you back up and consistently recover your microservice architecture?How do you back up and consistently recover your microservice architecture?
How do you back up and consistently recover your microservice architecture?Cesare Pautasso
 
Microservices: An Eventually Inconsistent Architectural Style?
Microservices: An Eventually Inconsistent Architectural Style?Microservices: An Eventually Inconsistent Architectural Style?
Microservices: An Eventually Inconsistent Architectural Style?Cesare Pautasso
 
Disaster Recovery and Microservices: The BAC Theorem
Disaster Recovery and Microservices: The BAC TheoremDisaster Recovery and Microservices: The BAC Theorem
Disaster Recovery and Microservices: The BAC TheoremCesare Pautasso
 
The Blockchain as a Software Connector
The Blockchain as a Software ConnectorThe Blockchain as a Software Connector
The Blockchain as a Software ConnectorCesare Pautasso
 
Team Situational Awareness and Architectural Decision Making with the Softwar...
Team Situational Awareness and Architectural Decision Making with the Softwar...Team Situational Awareness and Architectural Decision Making with the Softwar...
Team Situational Awareness and Architectural Decision Making with the Softwar...Cesare Pautasso
 
JOpera - Eclipse-based Visual Composition Environment featuring a general lan...
JOpera - Eclipse-based Visual Composition Environment featuring a general lan...JOpera - Eclipse-based Visual Composition Environment featuring a general lan...
JOpera - Eclipse-based Visual Composition Environment featuring a general lan...Cesare Pautasso
 
Push-Enabling RESTful Business Processes
Push-Enabling RESTful Business ProcessesPush-Enabling RESTful Business Processes
Push-Enabling RESTful Business ProcessesCesare Pautasso
 
Atomic Transactions for the REST of us
Atomic Transactions for the REST of usAtomic Transactions for the REST of us
Atomic Transactions for the REST of usCesare Pautasso
 
Service Oriented Architectures and Web Services
Service Oriented Architectures and Web ServicesService Oriented Architectures and Web Services
Service Oriented Architectures and Web ServicesCesare Pautasso
 
Exploiting Multicores to Optimize Business Process Execution
Exploiting Multicores to Optimize Business Process ExecutionExploiting Multicores to Optimize Business Process Execution
Exploiting Multicores to Optimize Business Process ExecutionCesare Pautasso
 
Real-time Mashups di Web Service Geografici
Real-time Mashups di Web Service GeograficiReal-time Mashups di Web Service Geografici
Real-time Mashups di Web Service GeograficiCesare Pautasso
 
Towards Scalable Service Composition on Multicores
Towards Scalable Service Composition on MulticoresTowards Scalable Service Composition on Multicores
Towards Scalable Service Composition on MulticoresCesare Pautasso
 
WS-* vs. RESTful Services
WS-* vs. RESTful ServicesWS-* vs. RESTful Services
WS-* vs. RESTful ServicesCesare Pautasso
 
RESTful Service Composition with JOpera
RESTful Service Composition with JOperaRESTful Service Composition with JOpera
RESTful Service Composition with JOperaCesare Pautasso
 
USI SCUBE Associate Member
USI SCUBE Associate MemberUSI SCUBE Associate Member
USI SCUBE Associate MemberCesare Pautasso
 

More from Cesare Pautasso (20)

Beautiful APIs - SOSE2021 Keynote
Beautiful APIs - SOSE2021 KeynoteBeautiful APIs - SOSE2021 Keynote
Beautiful APIs - SOSE2021 Keynote
 
How do you back up and consistently recover your microservice architecture?
How do you back up and consistently recover your microservice architecture?How do you back up and consistently recover your microservice architecture?
How do you back up and consistently recover your microservice architecture?
 
Microservices: An Eventually Inconsistent Architectural Style?
Microservices: An Eventually Inconsistent Architectural Style?Microservices: An Eventually Inconsistent Architectural Style?
Microservices: An Eventually Inconsistent Architectural Style?
 
Disaster Recovery and Microservices: The BAC Theorem
Disaster Recovery and Microservices: The BAC TheoremDisaster Recovery and Microservices: The BAC Theorem
Disaster Recovery and Microservices: The BAC Theorem
 
The Blockchain as a Software Connector
The Blockchain as a Software ConnectorThe Blockchain as a Software Connector
The Blockchain as a Software Connector
 
Team Situational Awareness and Architectural Decision Making with the Softwar...
Team Situational Awareness and Architectural Decision Making with the Softwar...Team Situational Awareness and Architectural Decision Making with the Softwar...
Team Situational Awareness and Architectural Decision Making with the Softwar...
 
JOpera - Eclipse-based Visual Composition Environment featuring a general lan...
JOpera - Eclipse-based Visual Composition Environment featuring a general lan...JOpera - Eclipse-based Visual Composition Environment featuring a general lan...
JOpera - Eclipse-based Visual Composition Environment featuring a general lan...
 
Push-Enabling RESTful Business Processes
Push-Enabling RESTful Business ProcessesPush-Enabling RESTful Business Processes
Push-Enabling RESTful Business Processes
 
BPMN for REST
BPMN for RESTBPMN for REST
BPMN for REST
 
SOA with REST
SOA with RESTSOA with REST
SOA with REST
 
Atomic Transactions for the REST of us
Atomic Transactions for the REST of usAtomic Transactions for the REST of us
Atomic Transactions for the REST of us
 
Service Oriented Architectures and Web Services
Service Oriented Architectures and Web ServicesService Oriented Architectures and Web Services
Service Oriented Architectures and Web Services
 
Exploiting Multicores to Optimize Business Process Execution
Exploiting Multicores to Optimize Business Process ExecutionExploiting Multicores to Optimize Business Process Execution
Exploiting Multicores to Optimize Business Process Execution
 
Real-time Mashups di Web Service Geografici
Real-time Mashups di Web Service GeograficiReal-time Mashups di Web Service Geografici
Real-time Mashups di Web Service Geografici
 
Towards Scalable Service Composition on Multicores
Towards Scalable Service Composition on MulticoresTowards Scalable Service Composition on Multicores
Towards Scalable Service Composition on Multicores
 
BPM with REST
BPM with RESTBPM with REST
BPM with REST
 
WS-* vs. RESTful Services
WS-* vs. RESTful ServicesWS-* vs. RESTful Services
WS-* vs. RESTful Services
 
RESTful Service Composition with JOpera
RESTful Service Composition with JOperaRESTful Service Composition with JOpera
RESTful Service Composition with JOpera
 
SOA2010 SOA with REST
SOA2010 SOA with RESTSOA2010 SOA with REST
SOA2010 SOA with REST
 
USI SCUBE Associate Member
USI SCUBE Associate MemberUSI SCUBE Associate Member
USI SCUBE Associate Member
 

Recently uploaded

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 

Recently uploaded (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 

Scientific and Grid Workflow Management (SGS09)

  • 1. Scientific and Grid Workflow Management Cesare Pautasso University of Lugano http://www.pautasso.info 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 1
  • 2. Abstract Grid workflow management systems coordinate multiple job submissions over heterogeneous Grid resources. They feature visual programming environments to give scientist a high-level view over distributed computations composed of Grid services. This brief introduction to the field of scientific and Grid workflows includes a survey of selected workflow management tools and outlines current research trends. Swiss Grid School SGS‘09 @ GPC’09, Geneva, Switzerland 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 2
  • 3. Cesare Pautasso Ph.D. at ETH Zürich (2004) Post-Doc at ETH Zürich in the Systems (IKS) Group • Software: JOpera: Process Support for more than Web services http://www.jopera.org/ Researcher at IBM Zurich Research Lab (2007) Assistant Professor at the new Faculty of Informatics, University of Lugano (USI), Switzerland (since September 2007) • USI Representative in the SwiNG Assembly (since 2007) • Grid Workflow Working Group Lead (since 2007) More Information: http://www.pautasso.info/ Follow me on: http://twitter.com/pautasso/ 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 3
  • 4. Acknowledgements Some material contained in this tutorial was adapted from slides originally published by: Gustavo Alonso Win Bausch Ewa Deelman Ian Foster Yolanda Gil Carole Goble Roy Grønmo Thomas Heinis Rajesh Kalyanam Francesco Lelli Omer F. Rana Heiko Schuldt Frank Terpstra 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 4
  • 5. Outline Why Workflow Management on the Grid? Discussion: Scientific vs. Grid vs. Business Workflows • Some Application Examples Workflow Modeling Languages and Tools Overview • Grid Workflow Language Patterns Running Workflows on the Grid • JOpera: Scientific Workflow for Eclipse • Workflows and Provenance 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 5
  • 6. Why Workflow Management on the Grid? 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 6
  • 7. Kinds of Grid Computation One Job Submission Parameter Sweep 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 7
  • 8. Variablility and error Condition A assessment Significance assessment Determine parameters for likelihood for differential Annotate and error model expression for each normalize data gene Data MicroArray Preprocessing wet lab Scanner processing Clustering Extract raw spot intensities Determine 1 spot = 1 gene Expression Expression level: Image processing Pattern Condition B Green: A > B Red: A < B Black: A = B Cell 8.5.2009 population Scientific and Grid Workflow (Cesare Pautasso) 8
  • 9. Variablility Condition A in vitro in silico and error assessment Significance assessment Determine parameters for likelihood for differential Annotate and error model expression for each normalize data gene Data MicroArray Preprocessing wet lab Scanner processing Clustering Extract raw spot intensities Determine 1 spot = 1 gene Expression Expression level: Image processing Pattern Condition B Green: A > B Red: A < B Black: A = B Cell 8.5.2009 population Scientific and Grid Workflow (Cesare Pautasso) 9
  • 10. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 10
  • 11. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 11
  • 12. Vision for Scientific and Grid Workflows Make it easy to build Grid applications composed of multiple jobs “ Provide the scientist with a platform that takes care of all data handling and record keeping chores so that the user can concentrate on the science and not computer science 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) ” 12
  • 13. Workflow GRID 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 13
  • 14. Some (Scientific) Workflow Management Systems Askalon GWFE Pegasus Triana Bigbross Bossa GWES Pipeline Pilot Trident BioPipe ICENI P-GRADE Twister BPMN Inforsense PowerFolder Ultimus Breeze JIGSA Ptolemy II Versata Carnot JOpera Savvion Viztrails Con:cern Kepler Seebeyond wftk DAGMan Karajan SCIRun XFlow DiscoveryNet Oakgrove's ScyFLOW YAWL Dralasoft reactor SDSC Matrix Wildfire GEL OSIRIS SHOP2 WFEE GridAnt OSWorkflow Taverna WS-BPEL Grid Job Handler OpenWFE Teuta (UML) ZBuilder 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 14
  • 15. Scientific vs. Grid vs. Business Workflows 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 15
  • 16. The Origins: Business Process Management who has to do what, when 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 16
  • 17. The Origins: Business Process Management • A business process describes key procedures within an organization. They involve: • multiple steps • numerous people • large amounts of resources • In large business organizations there are many factors that increase the complexity of the business processes: • processes are not well documented • conformance to rules not guaranteed • people lack information about context • company lacks monitoring tools • steps, people and resources are not properly coordinated • Workflow Management Systems try to address these problems by automating the coordination aspects of a business process: who has to do what, when, and with which software tools. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 17
  • 18. Business Workflows “ The automation of a business process where documents, information to be processed or tasks to be carried out are passed from one participant to another following a set of procedural rules Worfklow Management Coalition (WfMC, 1993) ” 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 18
  • 19. Scientific Workflows are networks of analytical “ steps that may involve, e.g., database access and querying, data analysis and mining, and many other steps including computationally intensive jobs submitted to high performance clusters and Grids ” Bertram Ludäscher 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 19
  • 20. Modeling Workflows activity 4 0 activity 1 2 1 User Software User Software Tool Tool Control Flow 3 4 Activity Activity 5 Data Flow Data Data 6 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 20
  • 21. Business Workflows activity 4 0 activity 1 2 1 User User Software Tool Control Flow 3 4 Activity Activity 5 6 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 21
  • 22. Scientific Workflows activity 4 0 activity 1 2 1 Software Software Tool Tool Control Flow 3 4 Activity Activity 5 Data Flow Data Data 6 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 22
  • 23. Similarities: Are scientists doing e-Business? Capturing knowledge/best practices Capture business processes within a company Capture scientific experiments Executable Models for Repeated Execution Run a well defined procedure many times Ensure that an experiment can be reproduced Incorporate human decision in the process Can we always do straight-through processing? Hard to achieve full automation 23
  • 24. Differences: Do scientists need business transactions? Rate of change Changing business procedures requires management approval Exploratory scientific processes require high flexibility Which kind of data? Travel reservations, Loan applications Large protein sequence databases, Astronomy image catalogs What is the ultimate goal? Making profit Making science 24
  • 25. Scientific vs. Grid Workflows Scientific workflows Grid workflows focus on the emphasize the design of large-scale execution of virtual experiments: scientific workflows: • Data flow models • Mapping and adaptation to a • Reusable “scientific computing” dynamic run-time environment component library • Provide access to shared • Interactive debugging, workflows as a Grid service monitoring and steering • Parameterized Execution • Data provenance and lineage • Centralized vs. Distributed tracking for reproducibility Execution Architectures • Model versioning for • Fault Tolerance exploratory customization • Optimization 25
  • 26. Scientific Workflows on the Grid • How can Scientific WF benefit from the Grid? 1. Leverage underlying Grid middleware: • Resource Management • Job Scheduling • Large Data Transfers (GridFTP) between Activities 2. Improved QoS based on the workflow model • Grid resource reservation • Data replication • Data placement • Fault Tolerance 26
  • 27. http://www.jopera.org/ Example 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 27
  • 28. A Web Service-Enabled Workflow System for Climate Modeling Data Processing in TeraGrid Rajesh Kalyanam Lan Zhao Taezoon Park Sebastien Goasguen 28
  • 29. Architecture Application Desktop Web Poral Application Workflow Engine JOpera Workflow Components Data Components Computation Components Metadata Data Data Data Globus Models / Condor Job Query Discovery Transfer Transformation Job Tools Purdue Data Management System Remote Data Proxies Computation Middleware From Rajesh Kalyanam OPeNDAP THREDDS SRB/MCAT Data Proxy Condor-G Globus Local Datasets Remote Datasets Computation Resources Climate Remote LARS PTO NWS Local Clusters TeraGrid Modeling Datasets 29
  • 30. Portal From Rajesh Kalyanam 30
  • 31. Workflow Model From Rajesh Kalyanam 31
  • 32. Workflow Execution From Rajesh Kalyanam 32
  • 33. Workflow Results From Rajesh Kalyanam 33
  • 34. Workflow Lifecycle 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 34
  • 35. Workflow Lifecycle Modeling Workflow Execution Scientific Workflow Workflow Computation Model Instances From Gustavo Alonso Simulation Log Analysis 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 35
  • 36. Workflow Modeling Methodologies 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 36
  • 37. Bottom up Composition 4. Share and Publish it as Web Service 3. Run, Test, and Debug the execution within the same modeling environment 2. Build a workflow using a drag, drop and connect modeling environment 1. Select components from a library a. Lookup services in a public registry b. Import from external Web service (WSDL) c. Search the standard library 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 37
  • 38. Top down Decomposition 1. Define a goal and Draw a skeleton of the workflow that satisfies it 2. Refine it and Bind services into it: • Search for existing matching services • Build missing services (if necessary) • Add required data transformations 3. Run, Test, and Debug the execution within the same modeling environment 4. Share and Publish it as Web Service 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 38
  • 39. Iterative Composition Change, Rediscover Build New services rvices Refactor er se cov Dis Model Manage Service Composition Deploy Run, Test Check Compile 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 39
  • 40. Workflow Modeling Languages and Tools Overview 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 40
  • 41. HeNCE - The Ancestor of Grid Workflows? A. Beguelin, J. J. Dongarra, G. A. Geist, R. Manchek, V. S. Sunderam, Graphical Development Tools for Network-Based Concurrent Supercomputing, in: Proc. of the 1991 ACM/IEEE conference on Supercomputing, Albuquerque, New Mexico, 1991, pp. 435–444. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 41
  • 42. request : Inbox <<transformation>> <<transformation>> symbol = Symbol; country1 := 'USA'; country2 = Currency; symbol : getQuoteRequest1 rateRequest : getRateRequest <<WebServiceCall>> <<WebServiceCall>> getQuote getRate {service=net.xmethods.services.stockquote.StockQuoteService, {wsdl=http://www.xmethods.net/sd/2001/CurrencyExchangeService.wsdl, operation=getQuote, service=CurrencyExchangeService, wsdl=http://services.xmethods.net/soap/urn:xmethods-delayed-quotes.wsdl, portType=CurrencyExchangePortType, portType=net.xmethods.services.stockquote.StockQuotePortType} operation=getRate} originalPrice : getQuoteResponse1 conversionRate : getRateResponse <<transformation>> <<transformation>> a = Result; b = Result; Extended UML twoNumbers : MultiplyInbox Activity Diagrams <<ImmediateStep>> From Roy Gronmo product : MultiplyOutbox Multiply <<transformation>> <<transformation>> OriginalPrice = Result; ConvertedPrice = c; response : Outbox
  • 43. ExampleStockQuoteConvert Input JOpera Data Flow usa country symbol Graph country1 country2 symbol ? getRate getQuote Result b a Result DataIntegration c quote ExampleStockQuoteConvert Output
  • 44. Triana 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 44
  • 45. Taverna 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 45
  • 46. VizTrails 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 46
  • 48. OSIRIS 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 48
  • 49. JOpera 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 49
  • 50. Grid Workflow Language Patterns 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 50
  • 51. Workflow Pattern Variants Implicit 1. Simple Parallelism Explicit Static 2. Data Parallelism Dynamic Best Effort Blocking Hybrid 3. Pipelining Buffered Synchronized Superscalar Out of Order Streaming 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 51
  • 52. Modeling Simple Parallelism Data Flow, Graph Based SCIRun Kepler Triana 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 52
  • 53. Modeling Simple Parallelism Control Flow, Graph Based JOpera GEL UML 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 53
  • 54. Modeling Simple Parallelism Control Flow, Block Based BPMN WS-BPEL 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 54
  • 55. Modeling Data Parallelism Data Flow, Graph Rewriting Static or Dynamic Triana Taverna JOpera 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 55
  • 56. Modeling Data Parallelism Control Flow, Block Based, Dynamic WS-BPEL AGWL Karajan GEL 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 56
  • 57. Modeling Pipelined Execution 1 2 1, 2, 3, … 3 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 57
  • 58. Pipelining Semantics 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 58
  • 59. Best Effort Pipelined Execution Drop data elements on pipeline collisions Advantages: • Simplified implementation • Some applications may tolerate data loss Problem: • Downsampling is non deterministic 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 59
  • 60. Blocking Pipelined Execution Tasks are blocked if successors are busy Advantages: • Avoid data loss in the pipeline Problem: • Pipeline speed limited by slowest task • Data may be lost before it enters the pipeline 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 60
  • 61. Buffered Pipelined Execution Tasks are decoupled by buffers Advantages: • Collisions are prevented • Best applied to tasks having variable speed Problem: • Buffer capacity is limited (Blocking still needed – Hybrid semantics) 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 61
  • 62. Streaming Pipelined Execution Tasks exchange data while running Advantages: • Suitable for a distributed (P2P) engine Problems: • Shifts complexity from the workflow engine to the tasks • Tasks exchange data while running • Workflow/Task interface more complex 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 62
  • 63. Running Workflows on the Grid 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 63
  • 64. Workflow Model Basic Architecture Act 2 Act 4 Act 1 Act 5 Act 7 Act 3 Act 6 Workflow Management System Adapters Grid Schedulers Grid Resources 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 64
  • 65. Workflow Workflow Workflow Model Users Participants Act 2 Act 4 Act 1 Act 5 Act 7 Act 3 Act 6 Workflow Management System Workflow Modelers Adapters Grid Schedulers Wrapper Scientific Developers Grid Software Resources Developers 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 65
  • 66. Standard APIs Process Definition Tools Interface 1 Workflow API and Interchange formats Interface 4 Interface 5 Workflow Enactment Service Other Workflow Administration Enactment Service(s) & Monitoring Tools Workflow Workflow Engine(s) Engine(s) Workflow Reference Model, 1998 Interface 2 Interface 3 Workflow Invoked From WFMC, Client Applications Applications 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 66
  • 67. Wrappers and Grid Applications Worklist Act X Workflow Management System Handler adapter Application wrapper Grid X Scheduler Application X Application Application X X 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 67
  • 68. Wrappers and Legacy Applications • The workflow engine is also in charge of connecting the different scientific applications. • These applications do not have to talk directly to each other, they do it through the workflow engine. • Most engines target a service oriented applications for which they provide very good connectivity through standardized protocols. Otherwise, the interface adapters must be developed on a case by case basis (as a last resort manual integration may be required!) • For legacy application, a wrapper must be built so that the workflow engine can communicate with the application. The wrapper can be a simple relay of commands and data, or a complete translation program implementing functionality not present in the legacy application. • For most Grid applications, the interaction takes place through a Grid scheduler, which is responsible for managing the distributed execution of the applications. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 68
  • 69. Run-time Abstraction Levels 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 69
  • 70. Run-time Abstraction Levels • A design-time workflow model needs to be mapped across different abstraction levels in order to be executed at run time. • User request the execution of a new workflow instance. • The abstract workflow is mapped to an executable instance by: • Finding suitable service implementations and binding them to the tasks • Rewriting the workflow graph based on a set of refinement rules • Planning required data staging, registration, placement, replication and transfer operations • Each task of the resulting executable workflow is then submitted to a Grid resource manager so that it can be scheduled on suitable resources • The mapping can be done: • when the workflow is started at instantiation time (statically) • incrementally as the workflow runs (adaptive execution with dynamic late binding) 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 70
  • 71. Example: Binding with WS-BPEL tion ti me set of services co mp o s i (BPEL partner link type) one service deployment time (WSDL port type) Activity service end point (port) invocation time 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 71
  • 72. Workflow Binding Lifecycle Library Registration time (classification) Modeling time (static early binding) Compilation time (blacklisting) Deployment time (customization) Startup time (testing) Task Execution time (dynamic late binding) Failed invocation time (rebind on retry) 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 72
  • 73. http://www.jopera.org/ JOpera Scientific Workflow for Eclipse 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 73
  • 74. High Level Workflow Language  Data and Control Aspects (Visual Representation)  Recursion, Iteration, Parallelism and Pipelining  Open and Extensible Component Model  Run existing code without changes  Synchronous, Asynchronous, and Streaming interaction  Web services support (Axis, WSIF)  Secure access to remote file systems and hosts (SSH)  Easy to integrate with existing schedulers (e.g. Condor) 6 February 2009 ©2009 Cesare Pautasso | www.pautasso.info 74
  • 75. High Level Workflow Language  Data and Control Aspects (Visual Representation)  Recursion, Iteration, Parallelism and Pipelining  Open and Extensible Component Model  Run existing code without changes  Synchronous, Asynchronous, and Streaming interaction  Web services support (Axis, WSIF)  Secure access to remote file systems and hosts (SSH)  Easy to integrate with existing schedulers (e.g. Condor)  Strong Eclipse Foundation  Platform Independent (Eclipse/Java)  Flexible, Extensible, Modular and Embeddable 6 February 2009 ©2009 Cesare Pautasso | www.pautasso.info 75
  • 76. JOpera Visual Composition Language Workflows are modeled using multiple viewpoints: 1. Data Flow Graph 2. Control Flow Graph 3. Service Bindings isbn QueryBookPrice QueryBookPrice QueryBookPrice e c r P - s b o J e c r P - s b o J e c r P - s b o J Amazon.com price t s u q e R o u i t a n v j e R CurrencyConvert CurrencyConvert amount XE.com z i o l n a t U z i o l n a t U CurrencyConvert amount Exception Handler 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 76
  • 77. JOpera Example: Doodle Map Mashup Setup a Doodle with Yahoo! Local search and visualize the results of the poll on Google Maps 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 77
  • 78. Doodle Map Mashup Architecture Web Browser Workflow RESTful Engine Web Services RESTful API APIs GET POST GET 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 78
  • 79. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 79
  • 80. Extensible JOpera Component Model Combine in the same workflow jobs implemented using an open and extensible set of technologies JOpera Workflow WSDL Java Human XML SQL SSH Condor •Snippets •XSLT •Methods •XPath 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 80
  • 81. Sharing Workflows as a Service JOpera processes are automatically published to clients using a variety of access protocols Eclipse RCP Web Clients WS Clients Clients REST WSDL Java JOpera Workflow WSDL Java Human XML SQL SSH Condor 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 81
  • 82. JOpera ARC Integration Demo 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 82
  • 83. Workflows and Provenance 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 83
  • 84. Lineage in Scientific Workflows Scientists consider the “capture and generation of provenance information as a critical part of the workflow-generated data” “Sharing workflows is an essential element of education, and acceleration of knowledge dissemination.” Ewa Deelman et al. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 84
  • 85. Where does this picture come from? METADATA This photo was taken July 21, 1981, when the Voyager 2 spacecraft was 33.9 million km from the Saturn planet 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 85
  • 86. Is this the right metadata? METADATA Date: 16.4.2005 Dimension: 640x480 Colors: 32bits Size: 1.2MB Format: JPEG Camera Model: D100 Equipment Make: Nikon Flash Used: No Focal Length: 8.2 mm FOR F-Number: F/2.6 Title: White Arabian Horse SALE Exposure: 1/100 s Metering: pattern GPS: 50.2 Lat. 60.4 Lon. 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 86
  • 87. Would you buy a horse without this? 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 87
  • 88. Lineage in Spreadsheets 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 88
  • 89. Lineage in Spreadsheets 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 89
  • 90. Lineage in Spreadsheets 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 90
  • 91. Lineage in Spreadsheets 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 91
  • 92. Lineage in Databases What is the relationship between these tuples? SQL Problem: Query Inversion 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 92
  • 93. Lineage in Software Development What’s in a Makefile? CC = gcc CFLAGS = -Wall -g program: main.o input.o output.o logic.o $(CC) $(CFLAGS) main.o input.o output.o logic.o -o program main.o: main.c input.h output.h logic.h $(CC) $(CFLAGS) -c main.c input.o: input.c input.h $(CC) $(CFLAGS) -c input.c output.o: output.c output.h $(CC) $(CFLAGS) -c output.c logic.o: logic.c logic.h $(CC) $(CFLAGS) -c logic.c 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 93
  • 94. Lineage in Software Development Where does my program come from? CC = gcc CFLAGS = -Wall -g program: main.o input.o output.o logic.o $(CC) $(CFLAGS) main.o input.o output.o logic.o -o program main.o: main.c input.h output.h logic.h $(CC) $(CFLAGS) -c main.c input.o: input.c input.h $(CC) $(CFLAGS) -c input.c output.o: output.c output.h $(CC) $(CFLAGS) -c output.c logic.o: logic.c logic.h $(CC) $(CFLAGS) -c logic.c 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 94
  • 95. Lineage in Scientific Workflows Input Theory An ideal scientific Data workflow should document all of Output Published the steps linking Data Paper the original observations with Scientific Workflow the final published results so that the Input Observation process can be Data reproduced 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 95
  • 96. Data Provenance Where does this output document come from? ? 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 96
  • 97. Change Propagation What to recompute if this input changes? 8.5.2009 ! Scientific and Grid Workflow (Cesare Pautasso) 97
  • 98. Conclusion 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 98
  • 99. Workflow and Modeling Reuse Component Libraries Data, Data Products Metadata Catalogs Populate Adapt, Workflow with data Modify Template Workflow Data, Metadata, Instance Provenance Information Executable Map to Execute available Resource , Execution Workflow resources Application From Ewa Deelman Component Compute, Descriptions Storage Distributed and Network Resources Mapping 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 99
  • 100. Executed e-Science as Workflow? Executing Executable Provenance Query Not yet executable What I What I Did Am Doing Model What I … Want to Do From Ian Foster Execution environment Schedule 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 100
  • 101. Some References Gil, Y. et al.; Examining the Challenges of Scientific Workflows. IEEE Computer, Dec 2007 Taylor, I.J.; Deelman, E.; Gannon, D.B.; Shields, M. (Eds.) Workflows for e-Science: Scientific Workflows for Grids, Springer 2007 Yu, J.; Buyya, R.: A taxonomy of workflow management systems for grid computing, Journal of Grid Computing, 3(3–4):171–200 (2005) Pautasso, C.; Alonso, G.: Parallel Computing Patterns for Grid Workflows, Proc. Of WORKS@HPDC06, Paris, France, 2006 OGF Workflow Research Group http://www.isi.edu/~deelman/wf–rg/ Download This Tutorial Material http://www.pautasso.info/lectures/sgs09workflow.pdf 8.5.2009 Scientific and Grid Workflow (Cesare Pautasso) 101
  • 102. Free Download http://www.jopera.org/ 6 February 2009 ©2009 Cesare Pautasso | www.pautasso.info 102