Apache Airavata
By Shameera Rathnayaka
About me
MSc graduate student @ Indiana University
Developer, SciGaP Lab, Indiana University
Core developer of Apache Airavata
Committer and Project Management Committee member, Apache Software Foundation (Apache Airavata, Apache Axis2)
Contributor to Apache Karaf, Apache Sandesha, and Apache Rampart
GSoC student (2012, 2013)
GSoC mentor 2015
Goals for Module
Understand Apache Airavata design and implementation
See how we have incorporated lessons learned from running production services.
See how Apache Airavata can be extended.
SEAGrid, PGA and Apache Airavata
SEAGrid and the PGA are clients to Apache Airavata middleware
services.
They run separately from Apache Airavata
The PGA is a reference implementation for the Airavata API
Using Airavata APIs, we can integrate other gateways
PGA clones
Jupyter notebooks
Your favorite web framework.
We’ll see how to clone the PGA in the next section
What is Apache Airavata?
An open source, openly governed software framework for executing and managing computational jobs and workflows.
It also manages the metadata associated with these jobs.
Supports local clusters, supercomputers, national grids, and academic and commercial clouds.
Basis of the persistent gateway services platform (SciGaP)
Airavata Architectural Goals ...
Distributed Systems Concepts
Scalability
Fault Tolerance
Security
Component-Based Architecture
Loosely Coupled Components
Extension and expansion points
Operational Experience
Fault Handling: user, resource, and other errors
Experiment Recovery
Reliable Job Monitoring
High Level Architecture
[Figure: High Level Architecture. Labeled elements: User, API Server, Orchestrator, Work Queue, Worker, Job Monitor, Registry, Workflow Engine, AMQP Messaging, Computational Resources.]
Support Multiple Gateways
[Figure: Support Multiple Gateways. Labeled elements: three Users, each behind Gateway A, Gateway B, or Gateway C; API Server, Orchestrator, Work Queue, Worker, Job Monitor, Registry, Workflow Engine, AMQP Messaging, Computational Resources.]
Why a Component-Based Architecture Pattern?
Each component has a specific job to do.
API Server – hides all other components from the user
Orchestrator – makes decisions and selections
Worker – executes a set of tasks
Registry – data catalog
Workflow Engine – workflow enactment
Easy to evolve with new technologies.
AMQP messaging provides inter-component communication.
You can add new components as subscribers to system messages.
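The subscriber idea can be sketched with a tiny in-process publish/subscribe bus. This is only an illustration: `MessageBus`, the `job.status` topic, and the message fields are hypothetical stand-ins, not Airavata's AMQP setup or message format.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessageBus:
    """In-process stand-in for a message broker (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        # Register a handler; the publisher never needs to know about it.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        # Deliver the message to every subscriber and report how many received it.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = MessageBus()
seen_by_monitor: List[dict] = []
seen_by_audit: List[dict] = []

bus.subscribe("job.status", seen_by_monitor.append)
# A new component is added later as one more subscriber; the publisher is unchanged.
bus.subscribe("job.status", seen_by_audit.append)

delivered = bus.publish("job.status", {"job_id": "job-1", "state": "RUNNING"})
```

The point of the design is that adding `seen_by_audit` required no change to the code that publishes status messages.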
Scalability, Fault Tolerance, and Recoverability
Airavata worker capacity can be increased or decreased on demand to maintain performance during load spikes.
Airavata Workers scale horizontally.
Jobs are distributed between workers using the internal work queue.
[Figure: Orchestrator, Work Queue, and two Workers.]
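A minimal sketch of queue-based distribution, assuming threads stand in for worker instances and a standard-library queue stands in for the internal work queue. The function name `run_workers` and the job ids are hypothetical.

```python
import queue
import threading
from typing import Dict, List


def run_workers(job_ids: List[str], worker_count: int) -> Dict[str, str]:
    """Distribute jobs across worker threads through one shared queue.

    Returns a mapping of job id to the name of the worker that handled it.
    """
    work_queue: "queue.Queue[str]" = queue.Queue()
    handled: Dict[str, str] = {}
    lock = threading.Lock()

    def worker(name: str) -> None:
        while True:
            try:
                job_id = work_queue.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            with lock:
                handled[job_id] = name
            work_queue.task_done()

    # Fill the queue before starting workers so an empty queue means "finished".
    for job_id in job_ids:
        work_queue.put(job_id)

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled
```

Raising `worker_count` adds capacity without touching the code that enqueues jobs, which is the horizontal-scaling property the slide describes.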
Operational Fault Handling
User Errors
Pluggable job validation rules
Data Staging Errors
Retry on failure, depending on the situation (e.g., network glitches)
Job Submission Errors
Retry on failure, depending on the situation (e.g., SSH connection issues, queue limits)
Inform the administrator of allocation issues
Verify job submission
Job Failures on Remote Compute Resource
Copy standard output and standard error files
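The "retry depending on the situation" rule can be sketched as follows. The split into `TransientError` and `PermanentError`, the function name `with_retry`, and the backoff policy are all assumptions made for illustration; they are not Airavata's actual error classes or retry logic.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical: failures worth retrying (network glitch, dropped SSH connection)."""


class PermanentError(Exception):
    """Hypothetical: failures that need a person (for example, an allocation issue)."""


def with_retry(action: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.0) -> T:
    """Run action, retrying only on TransientError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return action()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last transient failure
            time.sleep(base_delay * 2 ** (attempt - 1))
            attempt += 1
        # PermanentError (and anything else) propagates immediately, so the
        # caller can notify an administrator instead of retrying.
```

Keeping the retry decision in the exception type means each task only has to classify its failure, not implement its own loop.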
Security
User authentication and authorization are essential.
Airavata API security is provided through WSO2 Identity Server.
The credential store manages all machine credentials.
SSH keys
SSH usernames and passwords
Airavata grants user permissions based on security role.
Super administrator
Administrator
User
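Role-based permission checks of this kind can be sketched in a few lines. The sketch assumes the three roles are strictly hierarchical, and the operation names are invented for illustration; neither assumption comes from the slides or from Airavata's API.

```python
from enum import IntEnum


class Role(IntEnum):
    """The three roles from the slide, assumed here to be hierarchical."""
    USER = 1
    ADMINISTRATOR = 2
    SUPER_ADMINISTRATOR = 3


# Hypothetical operations and the minimum role each one requires.
REQUIRED_ROLE = {
    "launch_experiment": Role.USER,
    "register_compute_resource": Role.ADMINISTRATOR,
    "create_gateway": Role.SUPER_ADMINISTRATOR,
}


def is_allowed(role: Role, operation: str) -> bool:
    """Return True if the role meets the operation's minimum requirement."""
    required = REQUIRED_ROLE.get(operation)
    if required is None:
        return False  # deny operations that have no rule
    return role >= required
```

Denying unknown operations by default keeps a missing rule from silently granting access.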
Other Features …
The Apache Thrift-based API lets users write clients in whatever language they prefer.
You can even plug in backend components written in different languages.
Extensibility through multiple extension points.
Doesn't require large infrastructure.
Incremental updates with almost zero downtime.
Minimal overhead for DevOps.
Multiple Job Monitoring …
Airavata's default job monitoring method is email.
Airavata also supports SSH-based job monitoring.
Airavata supports UNICORE job monitoring.
Plug in your own job monitoring mechanism.
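One way to picture a pluggable monitoring mechanism is a small interface plus a registry. Everything named here (`JobMonitor`, `InMemoryMonitor`, `register_monitor`, `job_state`, the state strings) is hypothetical and only illustrates the pattern; it is not Airavata's monitoring interface.

```python
from abc import ABC, abstractmethod
from typing import Dict


class JobMonitor(ABC):
    """Hypothetical extension point: anything that can report a job's state."""

    @abstractmethod
    def poll(self, job_id: str) -> str:
        """Return the current state of the job as a string."""


class InMemoryMonitor(JobMonitor):
    """Toy monitor backed by a dict, standing in for email or SSH polling."""

    def __init__(self, states: Dict[str, str]) -> None:
        self._states = dict(states)

    def poll(self, job_id: str) -> str:
        return self._states.get(job_id, "UNKNOWN")


MONITORS: Dict[str, JobMonitor] = {}


def register_monitor(name: str, monitor: JobMonitor) -> None:
    # Plugging in a new mechanism is just one more registry entry.
    MONITORS[name] = monitor


def job_state(monitor_name: str, job_id: str) -> str:
    return MONITORS[monitor_name].poll(job_id)
```

Callers depend only on `job_state`, so a new mechanism can be added without changing them.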
Easy to plug in new compute resources
Airavata comes with SLURM- and PBS-based job submission.
Airavata's template mechanism makes it easy to plug in different types of job submission.
Airavata has been extended to submit jobs to the JURECA supercomputer, which has a web service interface.
We are integrating Jetstream as a new compute resource, along with Jetstream cloud provisioning.
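To illustrate the template idea, here is a minimal sketch that renders a batch script for either scheduler from the same parameters. It uses Python's `string.Template` as a stand-in; the parameter names and the `render_job_script` function are hypothetical, and Airavata's own template mechanism is not shown here. The `#SBATCH` and `#PBS` directives are standard scheduler options.

```python
from string import Template

# One template per scheduler; supporting a new scheduler means adding a template.
SLURM_TEMPLATE = Template("""#!/bin/bash
#SBATCH --job-name=$job_name
#SBATCH --partition=$queue
#SBATCH --nodes=$nodes
#SBATCH --time=$wall_time
$command
""")

PBS_TEMPLATE = Template("""#!/bin/bash
#PBS -N $job_name
#PBS -q $queue
#PBS -l nodes=$nodes
#PBS -l walltime=$wall_time
$command
""")

TEMPLATES = {"SLURM": SLURM_TEMPLATE, "PBS": PBS_TEMPLATE}


def render_job_script(scheduler: str, **params: object) -> str:
    """Fill the scheduler's template; raises KeyError if a parameter is missing."""
    return TEMPLATES[scheduler].substitute(**params)
```

For example, `render_job_script("SLURM", job_name="demo", queue="normal", nodes=2, wall_time="00:30:00", command="echo hello")` produces a script that could be handed to `sbatch`.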
How to get your client to talk to Airavata?
Airavata provides multiple client SDKs.
We have worked heavily with PHP (PGA web client), Python (Jupyter and dev tests), and Java (desktop client).
Use an Airavata-provided client SDK to talk directly to the API Server.
Road Map Highlights …
Airavata Data Management
Organization, Analytics, Collaboration
Airavata data analytics tool
Airavata workflow support
Airavata OpenStack support
Ansible scripts to automate large-scale deployment in one click
Docker containers for Airavata components
Airavata Mesos integration
Questions ?
syodage@indiana.edu
Additional Resources
Experiment Execution
[Figure: Experiment Execution. Labeled elements: Experiment, API Server, Registry, Orchestrator, Work Queue, Apache Zookeeper, two Workers (each with a Submit Job arrow), Computational Resources, Email Notification.]
Execution Models
[Figure: Execution Models. Boxes labeled Experiment Model, Process Model (several), Task Model (several), and Job Model.]
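The four model names suggest a nested data structure. The sketch below assumes one plausible nesting (an experiment holds processes, a process holds tasks, and a task may hold a job); that nesting and every field name are assumptions for illustration, not Airavata's actual data model definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobModel:
    job_id: str
    state: str = "SUBMITTED"  # hypothetical state value


@dataclass
class TaskModel:
    task_id: str
    task_type: str              # e.g. "staging" or "submission" (hypothetical)
    job: Optional[JobModel] = None


@dataclass
class ProcessModel:
    process_id: str
    tasks: List[TaskModel] = field(default_factory=list)


@dataclass
class ExperimentModel:
    experiment_id: str
    processes: List[ProcessModel] = field(default_factory=list)

    def jobs(self) -> List[JobModel]:
        """Collect every job reachable from this experiment."""
        return [
            task.job
            for process in self.processes
            for task in process.tasks
            if task.job is not None
        ]
```

Walking the tree from the experiment down, as `jobs()` does, is how a client could summarize the status of everything an experiment launched.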
Execution Flow

Airavata_Architecture_xsede16


Editor's Notes

  • #7 1. Each component has specific work to do. 2. AMQP messaging provides inter-component communications. 3. Easy to evolve with new technologies. 4. You can add new components as subscribers to system messages
  • #11 Keep this simple. You can deploy an Airavata setup with one worker. If that worker only has enough resources to handle 1000 live jobs and at some point you need to support more than 1000, you can add another worker instance.
  • #12 Categorize – Infrastructure, Application, User, Gateway Operator (Allocations)
  • #17 SLURM: The Simple Linux Utility for Resource Management (Slurm) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. PBS: Portable Batch System, a job scheduler. Jetstream, led by the Indiana University Pervasive Technology Institute (PTI), will add cloud-based computation to the national cyberinfrastructure. Jetstream will be attractive to communities who have not been users of traditional HPC systems, but who would benefit from advanced computational capabilities. SLURM systems: comet, stampede. PBS systems: bigred2, karst.
  • #18 If you attended the morning session, Sudhakar gave a nice demo of his SEAGrid desktop client. You have already seen how the PGA works. You will see some of the API calls in the Jupyter notebook session, which uses the Python SDK.