OGCE Workflow Suite
  • Scientific workflow requirements challenge WS-BPEL

OGCE Workflow Suite: Presentation Transcript

  • OGCE Workflow Toolkit for Multi-Scale Science Applications
    Suresh Marru
    Pervasive Technology Institute
    Indiana University
    ODI Gateway/Pipeline Evaluation
    NOAO
    Jan 21st 2010
  • TeraGrid
  • Data Capacitor
    • Provides temporary storage for the pipeline and the science gateway, interfaces with the archive, and handles data transfer from ODI and to compute resources
    • IU Bloomington data center
    • NSF funded project to create a high throughput 535 TB Lustre storage solution
    • Short term storage for applications with high bandwidth needs
    • IU added a wide area version in 2008
    • Enhanced wide area security
  • Data Capacitor Architecture
    Clients
    Metadata Servers
    Object Storage
    Object Storage
    Force-10 Switch
    Twenty-four object storage servers attached to shared storage
  • Massive Data Storage System
    MDSS Architecture
    • Deep
    • Wide
    • Open
  • Science Gateway
    A Science Gateway is a community-developed set of tools, applications, and data that is integrated via a portal or a suite of applications, usually in a graphical user interface, that is customized to meet the needs of the targeted community.
    More Info - http://www.teragrid.org/gateways/
  • Gateways Advantages
    Increase access to instruments
    Increase capabilities to analyze data
    Improve workforce development for underserved populations
    Increase outreach
    Increase public awareness: the public sees value in investments in large facilities
  • There are approximately 30 gateways using the TeraGrid
  • Example Gateway Accomplishments
    LEAD - access to radar data
    NVO – access to sky surveys
    OOI – access to sensor data
    PolarGrid – access to polar ice sheet data
    SIDGrid – analysis tools
    GridChem – developing multiscale coupling
    How would this have been done before gateways?
    How many details do we want each individual scientist to need to know?
  • TeraGrid Advantages and Challenges
    What’s different when the resource doesn’t belong just to me?
    Resource discovery
    Accounting
    Security
    Proposal-based requests for resources (peer-reviewed access)
    Code scaling and performance numbers
    Justification of resources
    Gateway citations
    Tremendous benefits at the high end, but even more work for the developers
    Potential impact on science is huge
    Small number of developers can impact thousands of scientists
    But need a way to train and fund those developers and provide them with appropriate tools
  • Example Gateway: LEAD
  • Weather is Local, High-Impact, Heterogeneous and Rapidly Evolving…Yet Our Technologies and Thinking are Static
    Rain and Snow
    Fog
    Rain and Snow
    Snow and Freezing Rain
    Intense Turbulence
    Severe Thunderstorms
  • LEAD Dynamic Adaptive Infrastructure
    Storms Forming
    Forecast Model
    Streaming Observations
    Data Mining
    Instrument Steering
    Refine forecast grid
  • NSF Engineering Research Center for Collaborative Adaptive Sensing of the Atmosphere (CASA)
    UMass/Amherst, OU, CSU, UPRM
    Concept: inexpensive, phased array Doppler radars on cell towers and buildings
    Dynamically adaptive sensing of multiple targets while simultaneously meeting multiple end-user needs
  • LEAD Workflow Requirements
    Run jobs on-demand on TeraGrid.
    Deadline driven workflows (severe weather tracking)
    Users ranging from 8th grade students to seasoned researchers.
    Run jobs on Multiple TeraGrid resources to decrease turn-around time.
    Must integrate with the portal through a very user-friendly web interface.
  • Workflow Survey in 2003 (http://www.extreme.indiana.edu/swf-survey/)
  • High Level LEAD Architecture
    Workflow graph
    Application services
    Compute Engine
    User Portal
    Workflow Engine
    Fault Tolerance & scheduler
    Event Notification Bus
    Portal server
    MyLEAD Agent service
    Data Management Service
    Data Catalog service
    Provenance Collection service
    MyLEAD User Metadata catalog
    Data Storage
  • LEAD Pioneering Technology
  • LEAD Scientists and Educational Interactions
    Lowering the barrier for using complex end-to-end weather technologies
    Democratize
    Empower
    Facilitate
    End Users
    Developers
    Researchers
  • Layered Architecture
  • Open Grid Computing Environments
    Generalize, Harden, Build Test
  • Flexible Layered Service Oriented Architecture
    User Interactions
    Other Clients
    XBaya GUI
    Web Portal
    XBaya Core
    Event Bus
    Middleware Services
    GFac Services
    Workflow Engine (ODE)
    XRegistry
    XMCCat Metadata Catalog
    Compute & Data Resources
    Computational Cloud
    Local Lab Resources
    Computational Grids
  • OGCE Workflow Suite
    Generic Service Toolkit
    • Tool to wrap command-line applications as web services
    • Handles file staging, job submission, and monitoring
    • Extensible runtime for security, resource brokering, and urgent computing
    • Generic Factory service for on-demand creation of application services
    XRegistry
    • Information repository for the OGCE workflow suite
    • Register, search, retrieve, and share XML documents
    • User and hierarchical group based authorization
    XBaya
    • GUI-based tool to compose and monitor workflows
    • Extensible support for compiler plug-ins like BPEL, Jython, SCUFL
    • Dynamic workflow execution support: start, pause, resume, and rewind workflow executions
    Apache ODE Scientific Workflow Extensions
    • XBaya GUI integration for BPEL generation
    • Asynchronous support for long-running workflows
    • Instrumented with fine-grained monitoring
    Eventing System
    • Supports both WS-Eventing and WS-Notification standards
    • Very scalable
    • Persistent Message Box for clients behind firewalls or with intermittent network connectivity
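The message box idea in the Eventing System item above can be sketched as follows. This is a minimal illustration, not the OGCE implementation: the `MessageBox` class and its `deliver` and `fetch` methods are hypothetical names, and the messages are held in memory only, so the on-disk persistence of a real message box is left out. The point it shows is that a client that cannot accept inbound connections polls for messages that were stored on its behalf.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class MessageBox:
    """Hypothetical store-and-forward mailbox for clients that must poll."""

    def __init__(self) -> None:
        # One FIFO queue per client mailbox.
        self._boxes: Dict[str, Deque[str]] = defaultdict(deque)

    def deliver(self, box_id: str, message: str) -> None:
        """Called by the event system; the client does not need to be online."""
        self._boxes[box_id].append(message)

    def fetch(self, box_id: str, max_messages: int = 10) -> List[str]:
        """Called by the client; returns and removes the oldest messages."""
        box = self._boxes[box_id]
        fetched: List[str] = []
        while box and len(fetched) < max_messages:
            fetched.append(box.popleft())
        return fetched


box = MessageBox()
for i in range(3):
    box.deliver("client-a", f"event-{i}")

first_batch = box.fetch("client-a", max_messages=2)   # client reconnects and polls
second_batch = box.fetch("client-a")                  # later poll drains the rest
```

Because the client initiates every exchange, nothing in this pattern requires an inbound connection through the client's firewall.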
  • Generic Application Service Factory
  • Application Wrapper Framework
    Application Factory
    • Scientific Applications are wrapped into web services by filling in a web-form.
    • The Application Factory generates a web service for each application with I/O interfaces.
    • Registers WSDL for the service with a registry
    • Each service generates a stream of notifications that log the service actions back to the XMCCat Metadata Catalog, user monitoring, and provenance tracking tools
    App Service: run program & publish events
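A rough sketch of the wrapping idea on this slide, under stated assumptions: `AppDescription` and `wrap_application` are hypothetical names standing in for the web-form fields and the Application Factory, and the "service" here is a plain Python callable rather than a web service with a WSDL. It shows only the core pattern of running a command-line program on request and publishing events around the run.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AppDescription:
    """Hypothetical stand-in for the form fields that describe an application."""
    name: str
    command: List[str]  # executable plus any fixed arguments


def wrap_application(desc: AppDescription,
                     publish: Callable[[str], None]) -> Callable[..., str]:
    """Return a callable 'service' that runs the program and publishes events."""
    def service(*args: str) -> str:
        publish(f"{desc.name}: I am running your application")
        result = subprocess.run(
            desc.command + list(args),
            capture_output=True, text=True, check=True,
        )
        publish(f"{desc.name}: The application is finished")
        return result.stdout
    return service


# Usage: wrap a tiny command-line program (the Python interpreter echoing its arguments).
events: List[str] = []
echo = AppDescription(
    name="echo",
    command=[sys.executable, "-c", "import sys; print(' '.join(sys.argv[1:]))"],
)
echo_service = wrap_application(echo, events.append)
output = echo_service("hello", "workflow")
```

File staging, job submission to remote resources, and registration with a registry are omitted here; the toolkit described on the slide handles those steps.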
  • Service Monitoring via Events
    • The service output is a stream of events
    • I am running your request
    • I have started to move your input files
    • I have all the files
    • I am running your application
    • The application is finished
    • I am moving the output to your file space
    • I am done
    • These are automatically generated by the service using a distributed event system (WS-Eventing / WS-Notification)
    • Topic-based pub-sub system with a well-known “channel”
    Application Service Instance
    Notification Channel
    publisher
    listener
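The topic-based publish/subscribe pattern named on this slide can be sketched in a few lines. This is an in-memory illustration only: `NotificationChannel` and the topic name `workflow-42` are hypothetical, and no WS-Eventing or WS-Notification messages are actually exchanged. The event texts are the ones listed on the slide.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Listener = Callable[[str], None]


class NotificationChannel:
    """In-memory topic-based publish/subscribe channel (illustrative only)."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, topic: str, listener: Listener) -> None:
        self._listeners[topic].append(listener)

    def publish(self, topic: str, message: str) -> None:
        # Only listeners registered for this topic receive the message.
        for listener in self._listeners[topic]:
            listener(message)


SERVICE_EVENTS = [
    "I am running your request",
    "I have started to move your input files",
    "I have all the files",
    "I am running your application",
    "The application is finished",
    "I am moving the output to your file space",
    "I am done",
]

channel = NotificationChannel()
received: List[str] = []
channel.subscribe("workflow-42", received.append)    # listener side

for event in SERVICE_EVENTS:                         # publisher side
    channel.publish("workflow-42", event)
channel.publish("some-other-workflow", "not delivered to this listener")
```

The publisher never addresses a listener directly; both sides only agree on the topic, which is what lets monitoring and provenance tools attach without changes to the service.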
  • Workflow Composer
  • Interoperable XBaya Workflow Architecture
    BPEL 1.1
    BPEL 2.0
    SCUFL
    Abstract DAG Model
    Composition and Monitoring
    Python
    Dynamic Enactor/Interpreter
    Jython Based Enactor
    GPEL Engine
    Apache ODE Engine
    Taverna
    Python Runtime
    Message Bus
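One way to picture the architecture above is a single abstract DAG that several compiler plug-ins translate into different targets. The sketch below assumes a hypothetical plug-in registry (`COMPILERS`, `compiler`) and two toy targets; it does not generate real BPEL or SCUFL, and the task names are invented for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Callable, Dict, List, Set

# Abstract DAG: each task maps to the set of tasks it depends on.
Dag = Dict[str, Set[str]]

# Hypothetical plug-in registry: target name -> function that renders the DAG.
COMPILERS: Dict[str, Callable[[Dag], str]] = {}


def compiler(target: str):
    """Decorator that registers a compiler plug-in under a target name."""
    def register(fn: Callable[[Dag], str]) -> Callable[[Dag], str]:
        COMPILERS[target] = fn
        return fn
    return register


def execution_order(dag: Dag) -> List[str]:
    """Order tasks so that every task comes after its dependencies."""
    return list(TopologicalSorter(dag).static_order())


@compiler("script")
def to_script(dag: Dag) -> str:
    return "\n".join(f"run({task!r})" for task in execution_order(dag))


@compiler("outline")
def to_outline(dag: Dag) -> str:
    return " -> ".join(execution_order(dag))


workflow: Dag = {
    "preprocess": set(),
    "forecast": {"preprocess"},
    "visualize": {"forecast"},
}
outline = COMPILERS["outline"](workflow)
script = COMPILERS["script"](workflow)
```

Adding a new target means registering one more function; the composed workflow itself does not change.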
  • WS-BPEL
    Web Services Business Process Execution Language (WS-BPEL)
    De facto standard for specifying web-service-based business processes and service compositions
    Basic activities
    Invoke, Receive, Assign, ...
    Structured activities
    Sequence, Flow, ForEach, ...
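As a loose Python analogy for the structured activities listed above (not BPEL semantics: real WS-BPEL is XML and passes data through process variables and Assign), the helper functions below are hypothetical names that mimic the control flow only. Sequence runs steps in order, Flow runs branches concurrently, and ForEach repeats a body over a collection.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List

Step = Callable[[Any], Any]


def sequence(steps: Iterable[Step], value: Any) -> Any:
    """Sequence analogy: run the steps one after another, in order."""
    for step in steps:
        value = step(value)
    return value


def flow(branches: List[Step], value: Any) -> List[Any]:
    """Flow analogy: run the branches concurrently and wait for all of them."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(branch, value) for branch in branches]
        return [future.result() for future in futures]


def for_each(items: Iterable[Any], body: Step) -> List[Any]:
    """ForEach analogy: apply the same body to every item."""
    return [body(item) for item in items]


sequential = sequence([lambda x: x + 1, lambda x: x * 2], 3)
parallel = flow([lambda x: x + 1, lambda x: x * 10], 2)
repeated = for_each([1, 2, 3], lambda x: x * x)
```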
  • Workflow Composition, Execution & Monitoring
    XBaya enables users to construct, share, execute, and monitor sequences of tasks that run on resources ranging from local workstations to high-end compute resources.
  • GPEL
    Grid Process Execution Language
    BPEL4WS-based, home-grown research workflow engine
    Supports a subset of BPEL4WS 1.1
    One of the very early efforts to adapt BPEL
    Specifically designed for eScience usage
    Long running workflow support
    Decoupled client
  • Benefits of Porting to Apache ODE
  • XML Metadata Catalog (XMC Cat)
    Taming Complex Scientific Metadata Schemas
    “A significant need exists in many disciplines for long-term, distributed, and stable data and metadata repositories”
    • NSF Blue-Ribbon Advisory Panel on Cyberinfrastructure
    “Metadata is key to being able to share results”
    • UK e-Science Core Programme Study
    More Info: Scott Jensen, Beth Plale
  • Simple Recovery Architecture
    Portal
    BPEL Workflow Engine
    Application Performance Models
    Fault Tolerance/Recovery Service
    Resource Reliability Models
    Application Service
    OVP/RST/MIG
    NWS, MDS
    BQP
    Notification Service
    Deadline & Success Probability
  • OGCE Workflow Usage Flow
    Scientist/Application provider registers application description with Registry Service.
    Workflow Author constructs the workflow with multiple wrapped application services.
    Workflow is compiled and deployed to the ODE workflow Engine.
    Workflow inputs are captured by XBaya and workflow is launched to ODE.
    Workflow system and possibly some services publish notifications to the Message bus reporting the progress of the workflow.
    XBaya monitoring system listens to notifications and colors the workflow components to present workflow progress.
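The last step of the usage flow, turning notifications into colored workflow components, can be sketched as a small state update. The status names, the color scheme, and the `apply_notifications` function are assumptions made for illustration; the slides do not say which colors or states XBaya uses.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed status-to-color mapping (not taken from XBaya).
STATUS_COLORS: Dict[str, str] = {
    "waiting": "gray",
    "running": "yellow",
    "finished": "green",
    "failed": "red",
}


def apply_notifications(nodes: List[str],
                        notifications: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Fold (node, status) notifications into a color per workflow node."""
    status = {node: "waiting" for node in nodes}
    for node, new_status in notifications:
        # Ignore notifications about unknown nodes or unknown states.
        if node in status and new_status in STATUS_COLORS:
            status[node] = new_status
    return {node: STATUS_COLORS[state] for node, state in status.items()}


colors = apply_notifications(
    ["preprocess", "forecast", "visualize"],
    [
        ("preprocess", "running"),
        ("preprocess", "finished"),
        ("forecast", "running"),
    ],
)
```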
  • Packaged, Downloadable Software
    http://www.collab-ogce.org/ogce/index.php/Main_Page
  • ODI Overview – Image Acquisition
    WIYN Buffer
    ODI
    Data Capacitor
    Integration Server
    Data Capacitor
    Science Gateway
    End Users
    MDSS Archive
    TeraGrid Compute Resources
  • Overview – Image Copy
  • Overview – Image Transfer to Data Capacitor
  • Overview – Image Ingestions into the Archive
  • Overview – Clean Up
  • Overview – Automated Tier 1 Processing
  • Overview – User Driven Tier 2 Processing
  • Molecular Chemistry Gateway
    • Challenges:
    • Some apps have rich client GUIs, which is a challenge with asynchronous, long-running workflows
    • Workflow Verification Service
    • Parametric sweep scheduling, monitoring iteration steps, graphical composition
    • Human Interaction into Workflows
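For the parametric sweep challenge listed above, the basic expansion step can be sketched with the standard library. The `parameter_sweep` function and the parameter names (`basis_set`, `temperature`) are hypothetical examples, not taken from the gateway; scheduling and monitoring of the resulting runs are out of scope here.

```python
from itertools import product
from typing import Any, Dict, List, Sequence


def parameter_sweep(grid: Dict[str, Sequence[Any]]) -> List[Dict[str, Any]]:
    """Expand a parameter grid into one settings dictionary per run."""
    names = sorted(grid)  # fixed order so the run list is reproducible
    combinations = product(*(grid[name] for name in names))
    return [dict(zip(names, values)) for values in combinations]


runs = parameter_sweep({
    "basis_set": ["sto-3g", "6-31g"],
    "temperature": [250, 300, 350],
})
```

Each dictionary in `runs` would become the input of one workflow iteration, which is where iteration-level monitoring comes in.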
  • Example: Assimilation Workflow
  • Motif Network Genome Analysis
    Ensemble-type processing (minimal network reqs)
    Capacity-type computing
    Parallel processing
    Capability-type computing
    • Domain webs of large genomes
    • Input list of amino acid sequences
    • Identify all known domains
    • Construct webs
  • BioVLAB Cloud Use Case
  • Live Demo & Questions?