Service Oriented Architecture and Globus Toolkit

Ravi K Madduri
Argonne National Laboratory
University of Chicago
Agenda
q   Principles of Service Oriented Architecture
q   The Globus Toolkit
q   Web Services Basics
q   Grid Services
q   What people punt on?
    x   Intro to Globus Security, Service Registries
q   Workflows we created
q   Lessons learned
Principles of Service Oriented Architecture
q   Guiding principles define the ground rules for
    development, maintenance, and usage of the
    SOA
    x   Reuse, granularity, modularity, composability,
        componentization and interoperability
    x   Standards compliance (both common and
        industry-specific)
    x   Services identification and categorization,
        provisioning and delivery, and monitoring and
        tracking
Architectural Principles
q   Service encapsulation – Many web services are
    consolidated to be used under the SOA.
q   Service loose coupling – Services maintain a
    relationship that minimizes dependencies and only
    requires that they maintain an awareness of each
    other
q   Service contract – Services adhere to a
    communications agreement, as defined collectively
    by one or more service description documents
q   Service abstraction – Beyond what is described in
    the service contract, services hide logic from the
    outside world
Architectural Principles
q   Service reusability – Logic is divided into
    services with the intention of promoting
    reuse
q   Service composability – Collections of
    services can be coordinated and assembled
    to form composite services
q   Service autonomy – Services have control
    over the logic they encapsulate
Architectural Principles
q   Service optimization – All else being equal,
    high-quality services are generally considered
    preferable to low-quality ones
q   Service Discoverability – Services are
    designed to be outwardly descriptive so
    that they can be found and assessed via
    available discovery mechanisms
q   Service Relevance – Functionality is
    presented at a granularity recognized by
    the user as a meaningful service
Globus Software: dev.globus.org
q   Globus Projects (figure labels): OGSA-DAI, GT4, MPICH-G2, Java Runtime,
    Delegation, MyProxy, Data Rep, Replica Location, GridWay, C Runtime,
    CAS, GSI-OpenSSH, GridFTP, MDS4, Incubator Mgmt, Python Runtime,
    C Sec, GRAM, Reliable File Transfer, GT4 Docs
q   Project categories (figure labels): Common Runtime, Security,
    Execution Mgmt, Data Mgmt, Info Services, Other
Web Service Basics
      q   Web Services are a basic distributed
          computing technology that lets us construct
          client-server interactions




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
Web Service Basics 2
q   Web services are platform independent
    and language independent
    x   Client and server programs can be written in different
        languages, run in different environments, and still interact
q   Web services describe themselves
    x   Once located, a service can be asked how to use it
q   Web services are ideal for loosely coupled
    systems
    x   Unlike CORBA, EJB, etc.
WSDL: Web Services Description Language
q   Define expected messages for a service, and their
    (input or output) parameters
q   An interface groups together a number of messages (operations)
q   Bind an interface via a definition to a specific transport
    (e.g. HTTP) and messaging (e.g. SOAP) protocol
q   The network location where the service is implemented,
    e.g. http://localhost:8080
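
The four parts of a WSDL document listed on this slide can be read back out of a description with a few lines of Python. This is only a sketch: the WSDL 1.1 text below describes an imaginary "MathService" (its names, namespace, and address are assumptions made for illustration, not taken from the talk), and `summarize` is a hypothetical helper built on the standard library XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical WSDL 1.1 description of an imaginary MathService.
WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://example.org/math"
    targetNamespace="http://example.org/math">
  <message name="AddRequest"><part name="value" type="xsd:int"/></message>
  <message name="AddResponse"><part name="result" type="xsd:int"/></message>
  <portType name="MathPortType">
    <operation name="add">
      <input message="tns:AddRequest"/>
      <output message="tns:AddResponse"/>
    </operation>
  </portType>
  <binding name="MathSoapBinding" type="tns:MathPortType">
    <soap:binding style="document"
        transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="MathService">
    <port name="MathPort" binding="tns:MathSoapBinding">
      <soap:address location="http://localhost:8080/math"/>
    </port>
  </service>
</definitions>"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

def summarize(wsdl_text):
    """Collect the messages, operations, transport, and location of a WSDL."""
    root = ET.fromstring(wsdl_text)
    return {
        # Expected messages for the service
        "messages": [m.get("name") for m in root.findall("wsdl:message", NS)],
        # The interface (portType) groups operations
        "operations": [op.get("name")
                       for op in root.findall("wsdl:portType/wsdl:operation", NS)],
        # The binding ties the interface to a transport and messaging protocol
        "transport": root.find("wsdl:binding/soap:binding", NS).get("transport"),
        # The service element gives the network location
        "location": root.find("wsdl:service/wsdl:port/soap:address", NS).get("location"),
    }

info = summarize(WSDL)
print(info["operations"], info["location"])
```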
Real Web Service Invocation
q   Figure stages: Discover, Describe, Invoke
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
Web Services Server Applications
                                     q   Web service – software that
                                         exposes a set of operations
                                     q   SOAP Engine – handles SOAP
                                         requests and responses
                                         (Apache Axis)
                                     q   Application Server – provides
                                         “living space” for applications
                                         that must be accessed by
                                         different clients (Tomcat)
                                     q   HTTP server – also called a
                                         Web server, handles HTTP
                                         messages
                                     q   Figure label: Container
Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s02.html
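
To make the role of the SOAP engine concrete, the following sketch builds a SOAP 1.1 request and dispatches it to a plain function, which is roughly the job an engine such as Apache Axis performs on the server side. It is not Axis code: `build_request`, `handle_request`, the application namespace, and the `add` operation are all assumptions introduced for illustration, and only the standard library is used.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/math"                      # hypothetical application namespace

def build_request(operation, params):
    """Client side: wrap an operation call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def handle_request(xml_text, operations):
    """Server side: unpack the envelope, call the operation, wrap the result."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    name = call.tag.split("}", 1)[1]              # strip the namespace part
    args = {child.tag.split("}", 1)[1]: child.text for child in call}
    result = operations[name](**args)
    response = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(response, f"{{{SOAP_NS}}}Body")
    reply = ET.SubElement(body, f"{{{APP_NS}}}{name}Response")
    ET.SubElement(reply, f"{{{APP_NS}}}result").text = str(result)
    return ET.tostring(response, encoding="unicode")

def add(a, b):
    # Parameters arrive as text, so the service converts them itself.
    return int(a) + int(b)

reply = handle_request(build_request("add", {"a": 2, "b": 3}), {"add": add})
print(reply)
```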
Let’s talk about state
      q   Plain Web services are stateless




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
However, Many Grid Applications Require State




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
Keep the Web Service and the State Separate
   q   Instead of putting state in a Web
       service, we keep it in a resource
   q   Each resource has a unique key




Borja Sotomayor , http://gdp.globus.org/gt4-tutorial/multiplehtml/ch01s03.html
Resources Can Be Anything Stored

q   Web Service + Resource = WS-Resource
q   The address of a WS-Resource is called an endpoint reference
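
A minimal sketch of the pattern on the last two slides: the service object holds no per-client state, each resource has a unique key, and the pair of service address and key plays the part of an endpoint reference. `CounterService`, `EndpointReference`, and the address used are hypothetical names chosen for this illustration, not Globus classes.

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class EndpointReference:
    """Address of a WS-Resource: service address plus resource key."""
    address: str
    resource_key: str

class CounterService:
    """Stateless operations; all state lives in separately stored resources."""

    def __init__(self, address):
        self.address = address
        self._resources = {}   # resource key -> resource state

    def create(self):
        key = uuid.uuid4().hex           # each resource gets a unique key
        self._resources[key] = {"value": 0}
        return EndpointReference(self.address, key)

    def add(self, epr, amount):
        # The EPR tells the service which resource this call refers to.
        resource = self._resources[epr.resource_key]
        resource["value"] += amount
        return resource["value"]

service = CounterService("http://localhost:8080/counter")  # hypothetical address
first = service.create()
second = service.create()
service.add(first, 5)
print(service.add(first, 2), service.add(second, 1))
```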
Web Services So Far
q   Basic client-server interactions
q   Stateless, but with associated resources
q   Self describing using WSDL


q   But what we’d really like is a common way to
    x   Name and do bindings
    x   Start and end services
    x   Query, subscription, and notification
    x   Share error messages
Standard Interfaces
q   Service information
q   State representation
    x   Resource
    x   Resource Property
q   State identification
    x   Endpoint Reference
q   State Interfaces
    x   GetRP, QueryRPs, GetMultipleRPs, SetRP
q   Lifetime Interfaces
    x   SetTerminationTime
    x   ImmediateDestruction
q   Notification Interfaces
    x   Subscribe
    x   Notify
q   ServiceGroups
q   Figure: Client and Web Service connected by GetRP, GetMultRPs,
    SetRP, QueryRPs, Subscribe, SetTermTime, Destroy
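
The state interfaces named above can be pictured as four methods over a property bag. This is a sketch under stated assumptions: the class and method names are hypothetical Python spellings of GetRP, GetMultipleRPs, SetRP, and QueryRPs, and the query takes a Python predicate as a stand-in for the query expression a real implementation would accept.

```python
class ResourcePropertyDocument:
    """Hypothetical in-memory view of one resource's properties."""

    def __init__(self, **properties):
        self._properties = dict(properties)

    def get_rp(self, name):
        # GetRP: read one resource property by name
        return self._properties[name]

    def get_multiple_rps(self, names):
        # GetMultipleRPs: read several properties in one call
        return {name: self._properties[name] for name in names}

    def set_rp(self, name, value):
        # SetRP: update (or add) a property
        self._properties[name] = value

    def query_rps(self, predicate):
        # QueryRPs: select properties matching a query (predicate stand-in)
        return {n: v for n, v in self._properties.items() if predicate(n, v)}

job = ResourcePropertyDocument(state="Pending", cpus=4, memory_mb=2048)
job.set_rp("state", "Active")
print(job.get_rp("state"))
print(job.get_multiple_rps(["cpus", "memory_mb"]))
print(job.query_rps(lambda name, value: isinstance(value, int) and value > 100))
```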
WSRF & WS-Notification

q   Naming and bindings (basis for virtualization)
    x   Every resource can be uniquely referenced, and has one or
        more associated services for interacting with it
q   Lifecycle (basis for fault-resilient state management)
    x   Resources created by services following factory pattern
    x   Resources destroyed immediately or at a scheduled time
q   Information model (basis for monitoring & discovery)
    x   Resource properties associated with resources
    x   Operations for querying and setting this info
    x   Asynchronous notification of changes to properties
q   Service Groups (basis for registries & collective svcs)
    x   Group membership rules & membership management
q   Base Fault type
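
The lifecycle and notification ideas on this slide can be sketched together: a factory creates resources, each resource can be destroyed immediately or given a termination time, and subscribers are told when a property changes. `ResourceFactory` and its methods are hypothetical names, and time is a plain number so that the example stays deterministic.

```python
class ResourceFactory:
    """Hypothetical factory that creates, expires, and notifies on resources."""

    def __init__(self):
        self._resources = {}
        self._next_id = 0

    def create(self, **properties):
        # Factory pattern: the service creates the resource and returns its key
        self._next_id += 1
        key = f"res-{self._next_id}"
        self._resources[key] = {
            "properties": dict(properties),
            "termination_time": None,
            "subscribers": [],
        }
        return key

    def exists(self, key):
        return key in self._resources

    def destroy(self, key):
        # Immediate destruction
        del self._resources[key]

    def set_termination_time(self, key, when):
        # Scheduled destruction
        self._resources[key]["termination_time"] = when

    def sweep(self, now):
        # Remove every resource whose termination time has passed
        expired = [k for k, r in self._resources.items()
                   if r["termination_time"] is not None
                   and r["termination_time"] <= now]
        for key in expired:
            self.destroy(key)
        return expired

    def subscribe(self, key, callback):
        self._resources[key]["subscribers"].append(callback)

    def set_property(self, key, name, value):
        resource = self._resources[key]
        resource["properties"][name] = value
        for callback in resource["subscribers"]:
            callback(key, name, value)   # asynchronous in practice; direct call here

factory = ResourceFactory()
job = factory.create(state="Pending")
events = []
factory.subscribe(job, lambda key, name, value: events.append((key, name, value)))
factory.set_property(job, "state", "Done")
factory.set_termination_time(job, when=100)
print(events, factory.sweep(now=50), factory.sweep(now=100))
```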
WSRF vs XML/SOAP
q   The definition of WSRF means that the
    Grid and Web services communities can
    move forward on a common base
q   Why Not Just Use XML/SOAP?
    x   WSRF and WS-N are just XML and SOAP
    x   WSRF and WS-N are just Web services
q   Benefits of following the specs:
    x   These patterns represent best practices that
        have been learned in many Grid
        applications
    x   There is a community behind them
    x   Why reinvent the wheel?
    x   Standards facilitate interoperability
WS Core Enables Frameworks: E.g., Resource Management
Layers, top to bottom:
q   Applications of the framework (compute, network, storage provisioning,
    job reservation & submission, data management,
    application service QoS, …)
q   WS-Agreement (agreement negotiation) and
    WS Distributed Management (lifecycle, monitoring, …)
q   WS-Resource Framework & WS-Notification (*)
    (resource identity, lifetime, inspection, subscription, …)
q   Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)

* An evolution of Open Grid Services Infrastructure (OGSI)
Globus and Web Services
q   Figure labels: User Applications; Globus WSRF Web Services;
    Registry and Admin; Globus Container (e.g., Apache Axis);
    WS-A, WSRF, WS-Notification; WSDL, SOAP, WS-Security
q   Globus Core: Java, C (fast, small footprint), Python
Globus and Web Services
q   Figure labels: User Applications; Custom Web Services;
    Custom WSRF Services; Globus WSRF Web Services;
    Registry and Admin; Globus Container (e.g., Apache Axis);
    WS-A, WSRF, WS-Notification; WSDL, SOAP, WS-Security
q   Globus Core: Java, C (fast, small footprint), Python
Globus Security
q   Extensible authorization framework
    based on Web services standards
    x   SAML-based authorization callout
        q   Security Assertion Markup Language, OASIS
            standard
        q   Often used for Web browser authentication
        q   Very short-lived bearer credentials
    x   Integrated policy decision engine
        q   XACML (eXtensible Access Control Markup
            Language) policy language, per-operation
            policies, pluggable
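
The idea of pluggable, per-operation policies can be sketched as a small decision point that looks up the rules registered for an operation and permits a call only if all of them agree. This is plain Python, not XACML and not the Globus authorization API: the class, the rule functions, and the subject names are assumptions made for illustration.

```python
PERMIT, DENY = "Permit", "Deny"

class PolicyDecisionPoint:
    """Hypothetical decision engine with per-operation, pluggable rules."""

    def __init__(self):
        self._rules = {}   # operation name -> list of rule functions

    def add_rule(self, operation, rule):
        # A rule is any callable taking the subject and returning True or False
        self._rules.setdefault(operation, []).append(rule)

    def decide(self, subject, operation):
        rules = self._rules.get(operation, [])
        # Deny by default: an operation with no rules is not permitted
        if rules and all(rule(subject) for rule in rules):
            return PERMIT
        return DENY

# Hypothetical subjects and rules
ADMINS = {"alice"}
MEMBERS = {"alice", "bob"}

pdp = PolicyDecisionPoint()
pdp.add_rule("submitJob", lambda subject: subject in MEMBERS)
pdp.add_rule("destroy", lambda subject: subject in MEMBERS)
pdp.add_rule("destroy", lambda subject: subject in ADMINS)

print(pdp.decide("bob", "submitJob"), pdp.decide("bob", "destroy"))
```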
Delegation Service
q   Higher level service
q   Authentication protocol independent
q   Refresh interface
q   Delegate once, share across services and invocation
q   Figure labels: Hosting Environment, Service1, Service2, Service3,
    Resources, Delegation Service, EPR, Delegate, Refresh, Client
Rachana Ananthakrishnan
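
A rough sketch of "delegate once, share across services": the client hands a credential to a delegation service, receives a reference to it, and passes that reference to any number of services in the same hosting environment; a refresh replaces the stored credential without changing the reference. The classes, the string credentials, and the reference format are hypothetical and carry no real security.

```python
class DelegationService:
    """Hypothetical store of delegated credentials, addressed by reference."""

    def __init__(self):
        self._credentials = {}
        self._count = 0

    def delegate(self, credential):
        # Store the credential and return a reference (EPR stand-in) to it
        self._count += 1
        epr = f"delegation-{self._count}"
        self._credentials[epr] = credential
        return epr

    def refresh(self, epr, credential):
        # Replace an existing delegated credential, keeping the same reference
        if epr not in self._credentials:
            raise KeyError(epr)
        self._credentials[epr] = credential

    def get_credential(self, epr):
        return self._credentials[epr]

class Service:
    """Hypothetical service that acts on the client's behalf."""

    def __init__(self, name, delegation_service):
        self.name = name
        self._delegation = delegation_service

    def run(self, epr):
        credential = self._delegation.get_credential(epr)
        return f"{self.name} acting with {credential}"

delegation = DelegationService()
epr = delegation.delegate("proxy-v1")          # delegate once
services = [Service(f"Service{i}", delegation) for i in (1, 2, 3)]
before = [s.run(epr) for s in services]        # shared across services
delegation.refresh(epr, "proxy-v2")            # refresh, same reference
after = [s.run(epr) for s in services]
print(before[0], "|", after[0])
```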
Delegation

      q   Secure Conversation
          x   Can delegate as part of protocol
          x   Extra round trip with delegation
          x   Types: Full or Limited delegation
          x   Delegation Service is preferred way of
              delegating
      q   Secure Message and Secure Transport
          x   Cannot delegate as part of protocol



Rachana Ananthakrishnan
Globus’s Use of Security Standards
q   Figure captions: “Supported, but slow”; “Supported, but insecure”;
    “Fastest, so default”
Monitoring and Discovery System (MDS4)
q   Grid-level monitoring system
    x   Helps a user or agent identify host(s) on which to
        run an application
    x   Warns on errors
q   Uses standard interfaces to provide publishing
    of data, discovery, and data access, including
    subscription/notification
    x   WS-ResourceProperties, WS-BaseNotification,
        WS-ServiceGroup
q   Functions as an hourglass to provide a
    common interface to lower-level monitoring
    tools
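
The hourglass idea can be sketched as an index that pulls data from several lower-level tools through small adapters and answers one common query, here "which hosts have enough free CPUs". The two tools, their output formats, and all names below are invented for illustration; they are not MDS4 information providers.

```python
# Hypothetical raw reports from two lower-level monitoring tools.
def tool_a_report():
    return [{"hostname": "node1", "idle_cpus": 4},
            {"hostname": "node2", "idle_cpus": 0}]

def tool_b_report():
    return "node3:8;node4:1"

# Adapters translate each tool's format into one common record shape.
def adapt_tool_a(raw):
    return [{"host": r["hostname"], "free_cpus": r["idle_cpus"]} for r in raw]

def adapt_tool_b(raw):
    records = []
    for item in raw.split(";"):
        host, cpus = item.split(":")
        records.append({"host": host, "free_cpus": int(cpus)})
    return records

class Index:
    """Narrow waist of the hourglass: one query interface over many sources."""

    def __init__(self):
        self._sources = []

    def register(self, fetch, adapt):
        self._sources.append((fetch, adapt))

    def query(self, min_free_cpus):
        records = [rec for fetch, adapt in self._sources
                   for rec in adapt(fetch())]
        return sorted(r["host"] for r in records
                      if r["free_cpus"] >= min_free_cpus)

index = Index()
index.register(tool_a_report, adapt_tool_a)
index.register(tool_b_report, adapt_tool_b)
print(index.query(min_free_cpus=2))
```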
Taverna
q   Figure: a sample caGrid workflow
q   Figure: caGrid Scavenger with semantic/metadata based
    caGrid service query
Sample Workflow with caDSR
q   Scientific value
    x   To find all the UML packages
        related to a given context
        (‘caCore’).
    x   Not a real scientific
        experiment.
         q   Simple.
         q   Important in caGrid.
q   Steps
    x   Querying the Project object.
    x   Doing the data transformation.
    x   Querying the Packages object
        and getting the result.
q   Figure legend: workflow input, caGrid services,
    “Shim” services, workflow output
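
The three steps of this workflow can be sketched as a small pipeline: a query, a "shim" that reshapes the first result into the input the next service expects, and a second query. The records and function names are hypothetical in-memory stand-ins; no caGrid or caDSR service is called.

```python
# Hypothetical stand-in data for the two queried object types.
PROJECTS = [
    {"id": "p1", "name": "ProjectA", "context": "caCore"},
    {"id": "p2", "name": "ProjectB", "context": "other"},
]
PACKAGES = [
    {"name": "pkg.alpha", "project_id": "p1"},
    {"name": "pkg.beta", "project_id": "p1"},
    {"name": "pkg.gamma", "project_id": "p2"},
]

def query_projects(context):
    """Step 1: query Project objects for a given context."""
    return [p for p in PROJECTS if p["context"] == context]

def shim_project_ids(projects):
    """Step 2: shim service, transforms query output into the next input."""
    return [p["id"] for p in projects]

def query_packages(project_ids):
    """Step 3: query Package objects belonging to the given projects."""
    return [pkg["name"] for pkg in PACKAGES if pkg["project_id"] in project_ids]

def run_workflow(context):
    return query_packages(shim_project_ids(query_projects(context)))

print(run_workflow("caCore"))
```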
Protein sequence information query

q   Scientific value
    x   To query protein sequence
        information out of 3 caGrid
        data services: caBIO, CPAS and
        GridPIR.
    x   To analyze a protein sequence
        from different data sources.
q   Steps
    x   Querying CPAS to get the id,
        name, and value of the sequence.
    x   Querying caBIO and GridPIR
        using the id or name obtained
        from CPAS.
Microarray clustering*
q    Scientific value
      x   A common routine to group
          genes or experiments into
          clusters with similar profiles.
      x   To identify functional groups of
          genes.
q    Steps
      x   Querying and retrieving the
          microarray data of interest from
          a caArrayScrub data service at
          Columbia University
      x   Preprocessing, or normalizing, the
          microarray data using the
          GenePattern analytical service
          at the Broad Institute at MIT
      x   Running hierarchical clustering
          using the geWorkbench
          analytical service at Columbia
          University
q    Figure legend: workflow in/output, caGrid services,
     “Shim” services, others
    *Wei Tan, Ravi Madduri, Kiran Keshav, Baris E. Suzek, Scott
    Oster, Ian Foster. Orchestrating caGrid Services in Taverna.
    ICWS 08.
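
For readers unfamiliar with the clustering step, here is a minimal agglomerative clustering sketch in pure Python. It is not the geWorkbench service: the average-linkage rule, the Euclidean distance, the stopping criterion (a fixed number of clusters), and the four toy expression profiles are all assumptions chosen to keep the example short.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def hierarchical_clustering(profiles, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy expression profiles (hypothetical values), one row per gene.
profiles = [[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]]
print(hierarchical_clustering(profiles, k=2))
```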
q   Figure labels: execution trace; execution result as XML;
    1936 gene expressions
Lymphoma type prediction

q   Scientific value                                                          *
    x   Using gene-expression patterns
        associated with DLBCL and FL to
        predict the lymphoma type of an
        unknown sample.
    x   Using SVM (Support Vector
        Machine) to classify data, and
        predicting the tumor types of
        unknown examples.
q   (Major) steps
    x   Querying training data from
        experiments stored in caArray.
    x   Preprocessing, or normalizing, the
        microarray data.
    x   Adding training and testing data
        into the SVM service to get
        the classification result.
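
The shape of this pipeline (normalize, train, predict) can be sketched without any external library. Note the substitution: the workflow on the slide uses an SVM service, whereas the sketch below uses a much simpler nearest-centroid classifier as a stand-in, with z-score normalization assumed for the preprocessing step. The expression values are toy numbers, not data from caArray.

```python
import statistics

def fit_scaler(rows):
    """Per-feature mean and standard deviation, learned from training data."""
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.pstdev(c) or 1.0) for c in columns]

def apply_scaler(scaler, row):
    """Z-score normalization of one sample."""
    return [(value - mean) / std for value, (mean, std) in zip(row, scaler)]

def train_centroids(rows, labels):
    """Nearest-centroid stand-in for the SVM: one mean vector per class."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {label: [statistics.mean(c) for c in zip(*members)]
            for label, members in grouped.items()}

def predict(centroids, row):
    """Return the class whose centroid is closest to the sample."""
    return min(centroids,
               key=lambda label: sum((a - b) ** 2
                                     for a, b in zip(row, centroids[label])))

# Toy training samples (hypothetical values): two genes per sample.
train_rows = [[8.0, 1.0], [7.5, 1.5], [2.0, 6.0], [1.5, 6.5]]
train_labels = ["DLBCL", "DLBCL", "FL", "FL"]

scaler = fit_scaler(train_rows)
centroids = train_centroids([apply_scaler(scaler, r) for r in train_rows],
                            train_labels)

unknown = [[7.0, 2.0], [2.0, 7.0]]
predictions = [predict(centroids, apply_scaler(scaler, r)) for r in unknown]
print(predictions)
```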


*Fig. from MA Shipp. Diffuse large B-cell lymphoma outcome prediction by
gene-expression profiling and supervised machine learning. Nature medicine,
q   Figure labels: Querying; Preprocessing; Classifying & predicting
Lymphoma type prediction
q   Result snippet
    *Classification errors are highlighted.

Acknowledgement:
Juli Klemm, Xiaopeng Bian, Rashmi Srinivasa (NCI)
Jared Nedzel (MIT)
Lessons Learned
q   Service abstraction not applicable to
    everything
q   Virtual Organization concepts still good
q   Web services are one way to create service
    oriented architectures but not always the
    best way
q   Make the implementation agnostic of the
    underlying tools
q   True value in ability to create workflows
Service-Oriented Science
     q   People create services (data or functions) …
     q   which I discover (& decide whether to use) …
     q   & compose to create a new function ...
     q   & then publish as a new service.

     q   I find “someone else” to host services,
         so I don’t have to become an expert in operating
         services & computers!
     q   I hope that this “someone else” can
         manage security, reliability, scalability, …

            “Service-Oriented Science”, Science, 2005
Questions?
