OGCE MSI Presentation
 


OGCE Presentation by Marlon Pierce at University of Minnesota Supercomputing Institute, February 11, 2011


    Presentation Transcript

    • Open Gateway Computing Environments: Software for Science Gateways
      Marlon Pierce, Suresh Marru, Raminder Singh, Gerald Guo, Archit Kulshrestha, Ye Fan, Patanachai Tangchaisin, and collaborators.
    • What Is a Science Gateway?
      • User interface and supporting Web services to scientific applications, data sets, and resources running on cyberinfrastructure.
        – Science portals, Grid Computing Environments, …
        – Broaden and simplify usage
      • Cyberinfrastructure: distributed computing resources and overlaying middleware for scientific computing.
        – Prominent examples include TeraGrid and Open Science Grid
        – Middleware includes Globus, Condor, iRods/SRB, …
        – Some of these approaches are being pushed by scientific cloud computing
        – That is another topic
    • TeraGrid is one of the largest investments in shared CI from NSF’s Office of Cyberinfrastructure
    • Cyberinfrastructure Layers
      [Diagram: layered architecture, top to bottom]
      • User Interfaces: Web/Gadget Container, Web Enabled Desktop Applications, Gateway Abstraction Interfaces, Web/Gadget Interfaces
      • Gateway Software: Application Abstractions, Fault Tolerance, User Management, Information Services, Monitoring, Workflow System, Provenance & Metadata Management, Auditing & Reporting, Registry, Security
      • Resource Middleware: Cloud Interfaces, Grid Middleware, SSH & Resource Managers
      • Compute Resources: Computational Clouds, Computational Grids, Local Resources
      • Color coding: OGCE gateway components; complementary gateway components; dependent resource provider components
    • Open Gateway Computing Environments
      • The OGCE team develops software for building secure, Web-based Science Gateways
        – Chemistry, Bioinformatics, Biophysics, Environmental Sciences
      • OGCE is funded by the National Science Foundation’s Software Development for Cyberinfrastructure (SDCI) program.
    • OGCE Funds Software Lifecycle
    • OGCE Software
      Name: Description
      • OGCE Gadget Container: An OpenSocial and Google gadget-compatible Web container for running Web gadgets.
      • GFAC: A Web service for generating, securely invoking, and managing the lifecycle of scientific applications on Grids and Clouds.
      • Workflow Tools: Composer (XBaya), enactment (“interpreter”) engines, event system, and service registry to support scientific workflows on Grids and Clouds.
      • Gadgets and Gadget Building Tools: Tools for building secure Google-gadget based Science Gateways.
    • Putting It All Together
    • OGCE Components in Action
      Featured Gateway: OGCE Components Used
      • UltraScan: GFAC scientific application management service
      • GridChem, ParamChem: XBaya workflow composer, OGCE Messenger Service, XRegistry
      • SimpleGrid: OGCE Gadget Container (in development)
      • Purdue CCSM Portal: Gadget Container and gadget building libraries (in development)
      • BioVLAB: GFAC, XBaya, XRegistry, Workflow Interpreter Service
    • Software Strategy
      • We develop downloadable, packaged, open source software
        – SourceForge
      • Focus: a) gadget container and b) tools for running science applications and workflows on grids and clouds.
      • Provide a tool set that can be used in whole or in part.
        – If you just want GFac, then you can use it without buying an entire framework.
      • Out of our scope: visualization, security, information services, data and metadata provenance and management.
        – MyProxy, TG IIS, Globus, Condor, XMC Cat, iRods, etc.
    • Apache Incubators
      • Joining Apache is key to our software sustainability strategy
        – Open source licensing, meritocracy, visibility
      • Vigyan: tools for science gateway services and workflows
        – XBaya, GFAC, Messenger, XRegistry
        – Collaboration with WSO2/LSF, IBM
        – Builds on Apache Axis2, Apache ODE
      • Rave: OpenSocial gadget manager, general purpose gadgets
        – Collaboration with Hippo, Mitre, SURFnet
        – Builds on Apache Shindig
    • The OGCE Gadget Container
      Managing layouts, look and feel, and behind-the-scenes services for aggregated Web gadgets
    • The OGCE Gadget Container allows you to build portals out of public and private Google Open Social gadgets. Supports HTTPS. Downloadable, packaged software.
    • The OGCE Application Registry gadget allows users to interactively register hosts and applications that are dynamically wrapped as Web services.
    • Google Gadget-Based Science Gateways: PolarGrid, LEAD
    • Mobile Support
      • Gadget Container is built with HTML, JavaScript and CSS.
      • Works in both iPhone and Android native browsers without modification.
      • Developing layout managers better suited to limited screen real estate.
    • Feature Groups: Features
      • Look and Feel: Tabbed and Tree layout managers, 2 and 3 column layouts, default maximized views of gadgets, customizable color styling.
      • Security: Supports end-to-end SSL between browser, container, and gadgets; OpenID authentication; OAuth-secured gadgets; MyProxy logins; limited Grid credential sharing between gadgets; CILogon for InCommon login.
      • Inter-Gadget Communication: Supports OpenAjax publish-subscribe style messaging between gadgets. PMRPC JavaScript messaging support in development.
      • REST Service API: Layouts, logins, sign-ups, user administration, user identification, and Grid credentials all accessible via REST service calls as well as the user interface.
      • Open Source Social Networking: All code is open source and builds on Apache Shindig 2.0.
      • Gadget Development: Support for GWT-based gadgets and YUI JavaScript libraries in development.
    • SimpleGrid Gadgets
      Requires YUI integration, OpenAJAX messaging, REST APIs
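The publish-subscribe style of inter-gadget messaging listed above can be illustrated with a minimal sketch. This is not the OpenAjax Hub API (which is JavaScript); `MessageHub`, the topic names, and the payload fields are hypothetical and only show the pattern: gadgets register a handler for a topic, and a publisher never needs to know who is listening.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

# A handler receives the topic name and the published payload.
Handler = Callable[[str, Any], None]


class MessageHub:
    """Hypothetical in-process hub standing in for a container's message bus."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler that is called for every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> int:
        """Deliver the payload to all subscribers; return how many were called."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


# Example: a "job monitor" gadget listens for status updates from another gadget.
hub = MessageHub()
received = []
hub.subscribe("job.status", lambda topic, payload: received.append((topic, payload)))

delivered = hub.publish("job.status", {"job_id": "demo-1", "state": "DONE"})
ignored = hub.publish("file.selected", {"path": "/tmp/example"})  # no subscribers
```

The publisher only names a topic, so gadgets can be added to or removed from a layout without changing each other's code.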
    • Bioinformatics Workflows in the Cloud
    • BioVLAB Architecture
    • BioVLAB Application Deployment Procedure
      • User: Develop a command line app.
      • Admin:
        – Install the app in Amazon EC2
        – Let the app store any output to Amazon S3
        – Make a virtual machine image
        – Register the app by using GFac
      • User: Instantiate EC2 and run the app by using XBaya
      • [Figure: GFac registration form]
    • BioVLAB-Microarray
      • Analysis of high-throughput microarray experiments
      • Multiple tasks in a single batch
      • Output of a task can be plugged into another task
      • Repeat the same set of tasks with small changes of parameters
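The last two points (chaining task outputs, then repeating the same chain with small parameter changes) can be sketched as follows. The task functions `normalize` and `threshold`, the `cutoff` parameter, and the sample data are hypothetical stand-ins, not BioVLAB tasks; real workflows here are composed in XBaya rather than in Python.

```python
from typing import Callable, Dict, List, Sequence

# A task takes the previous task's output plus a parameter set.
Task = Callable[[List[float], Dict[str, float]], List[float]]


def normalize(values: List[float], params: Dict[str, float]) -> List[float]:
    """Scale all values so that the largest becomes 1.0."""
    peak = max(values)
    return [v / peak for v in values]


def threshold(values: List[float], params: Dict[str, float]) -> List[float]:
    """Keep only values at or above the 'cutoff' parameter."""
    return [v for v in values if v >= params["cutoff"]]


def run_workflow(tasks: Sequence[Task], data: List[float],
                 params: Dict[str, float]) -> List[float]:
    """Run tasks in order, plugging each task's output into the next task."""
    for task in tasks:
        data = task(data, params)
    return data


def sweep(tasks: Sequence[Task], data: List[float],
          param_sets: Sequence[Dict[str, float]]) -> List[List[float]]:
    """Repeat the same set of tasks once per parameter set."""
    return [run_workflow(tasks, list(data), params) for params in param_sets]


intensities = [2.0, 4.0, 8.0]
results = sweep([normalize, threshold], intensities,
                [{"cutoff": 0.3}, {"cutoff": 0.6}])
```

Each parameter set produces an independent run, so the runs could be dispatched as separate batch jobs.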
    • BioVLAB-mCpG
    • OGCE Layered Workflow Architecture: Derived from LEAD Workflow System
      [Diagram: three layers]
      • Workflow Interfaces (Design & Definition): Flex/Web Composition; XBaya GUI (Composition, Deploying, Steering & Monitoring); Gadget Interface for Input Binding
      • Workflow Specification: BPEL 2.0, BPEL 1.0, Python, Java Code, Scufl, Pegasus DAG
      • Workflow Execution & Control Engines: Apache ODE, GBPEL, Dynamic Enactor, Jython Interpreter, Condor DAGMan, Taverna
    • UltraScan Science Gateway
      Biophysics gateway for ultracentrifugation experiment data analysis
    • UltraScan2 High Level Overview
      [Diagram: User, Web Server, MySQL DB, US LIMS, GridControl, TeraGrid, UTHSCSA Jacinto, TIGRE/Globus, High Performance Computing Clusters, Terascale storage, Network]
    • UltraScan TG Usage July 2007-June 2010
    • UltraScan Collaboration
      • Immediate goals: use GFAC as a replacement job submission service.
        – GRAM 2, 4, 5 independence
        – Significant effort into GRAM5 testing on Ranger.
      • Longer term goals
        – Integrate with TG information services to provide better job scheduling.
          • OGCE Resource Prediction Service
        – Support UNICORE job management.
      • [Figure: Current Architecture]
    • UltraScan problems and the solutions provided by OGCE
      • Problem: Gateway code can only submit to resources with GRAM4 installed and running.
        – Solution: GFAC supports different providers such as GRAM2/4/5, Condor, Local, and Remote using SSH keys. There is a generic GUI to configure them all.
      • Problem: Adding a new resource is time consuming.
        – Solution: Users need to fill in two web forms to configure a new resource.
      • Problem: The local cluster needed to install GRAM4.
        – Solution: We can directly invoke mpirun on a local or remote cluster using the local/remote providers.
      • Problem: TACC resources like Lonestar and Ranger decided not to install GRAM4 and moved to GRAM5.
        – Solution: It was easy to start using GRAM5 in GFAC, but getting GRAM5 to run operationally on these resources was time consuming.
      • Problem: Job failures and missing status.
        – Solution: Retry mechanism for certain GRAM error codes. We are still trying to find out how to deal with missing status or reconnect to those jobs, as the Globus API does not support that.
      • Problem: Restarting jobs was not provided in the gateway, even when the application supports checkpointing.
        – Solution: Added restart-job support from checkpoint files.
      • Problem: UltraScan3 needs to rewrite all these components again, as it uses a different technology.
        – Solution: Provided a REST interface to OGCE services, so clients in different languages can call the same interfaces for the required operations.
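The "retry mechanism for certain GRAM error codes" mentioned above can be sketched as a small wrapper that retries only when the failure is classified as transient. This is an illustrative sketch, not GFAC code: `JobSubmissionError`, `submit_with_retry`, and the numeric codes in `RETRYABLE_CODES` are hypothetical placeholders, not real GRAM error numbers.

```python
import time
from typing import Callable, FrozenSet


class JobSubmissionError(Exception):
    """Hypothetical error type carrying a middleware error code."""

    def __init__(self, code: int, message: str = "") -> None:
        super().__init__(f"submission failed with code {code}: {message}")
        self.code = code


# Placeholder codes for illustration only; not real GRAM error numbers.
RETRYABLE_CODES: FrozenSet[int] = frozenset({1001, 1002})


def submit_with_retry(submit: Callable[[], str], max_attempts: int = 3,
                      delay_seconds: float = 0.0,
                      retryable: FrozenSet[int] = RETRYABLE_CODES) -> str:
    """Call submit(); retry only on retryable codes, up to max_attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except JobSubmissionError as err:
            # Give up on non-retryable codes or when attempts are exhausted.
            if err.code not in retryable or attempt == max_attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")


# Example: a submission that fails twice with a transient code, then succeeds.
attempts = []


def flaky_submit() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise JobSubmissionError(1001, "transient failure")
    return "job-demo"


job_id = submit_with_retry(flaky_submit)
```

Errors outside the retryable set are re-raised immediately, so permanent failures surface on the first attempt.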
    • GFac Current & Future Features
      [Diagram: Globus, Campus Resources, Amazon, Eucalyptus, Unicore, Condor; Apache Axis2; Input Handlers, Output Handlers; Registry Interface, Scheduling Interface, Monitoring Interface; Fault Tolerance, Auditing, Checkpoint Support; Data Management Abstraction, Job Management Abstraction. Color coding distinguishes existing features from planned/requested features.]
    • Gram5 Testing
      • Developed a testing harness to run different cases.
      • Started with a small number of jobs and increased the concurrency later.
      • Watched job behavior on the resource and monitored the GRAM log.
        – We found many issues in the logs and are working with the Globus team to fix them.
      • Recorded all the job run data to create a Google gadget that graphs different runs on different resources.
    • TG Resources and patterns
      Resources:
        Version    Resource   Endpoint
        GT 5.0.2   QueenBee   queenbee.loni-lsu.teragrid.org:2120/jobmanager-pbs
        GT 5.0.2   Ranger     login5.ranger.tacc.teragrid.org:2120/jobmanager-sge
        GT 5.0.2   Lonestar   gatekeeper.lonestar.tacc.teragrid.org:2120/jobmanager-lsf
      Patterns:
        Concurrent jobs   Batch size   Total jobs   Job status (pass : fail)
        1                 10           10           10:0
        3                 10           30           30:0
        5                 10           50           50:0
        10                10           100          20:0
        20                10           200          40:0
        50                10           500          100:0
        100               10           1000         200:0
        200               5            1000         Not tested (need allocation)
        500               2            1000         Not tested (need allocation)
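A test pattern of this kind can be sketched as a small harness. The sketch assumes, based on the totals in the table, that each pattern runs "batch size" rounds of "concurrent jobs" simultaneous submissions, so that total jobs is their product. `run_pattern` and `stub_submit` are hypothetical names; a real harness would submit to a GRAM5 endpoint instead of calling a local stub.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple


def run_pattern(submit_job: Callable[[int], bool], concurrent_jobs: int,
                batch_size: int) -> Tuple[int, int]:
    """Run batch_size rounds of concurrent_jobs submissions; return (passed, failed)."""
    passed = 0
    failed = 0
    with ThreadPoolExecutor(max_workers=concurrent_jobs) as pool:
        for round_index in range(batch_size):
            # Launch one round of concurrent submissions.
            futures = [
                pool.submit(submit_job, round_index * concurrent_jobs + i)
                for i in range(concurrent_jobs)
            ]
            # Wait for the whole round before starting the next one.
            for future in futures:
                try:
                    ok = bool(future.result())
                except Exception:
                    ok = False  # an exception during submission counts as a failure
                if ok:
                    passed += 1
                else:
                    failed += 1
    return passed, failed


def stub_submit(job_index: int) -> bool:
    """Stand-in for a real job submission; always reports success."""
    return True


passed, failed = run_pattern(stub_submit, concurrent_jobs=3, batch_size=10)
```

Recording the `(passed, failed)` pair per pattern and per resource gives the data needed to graph runs across resources.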
    • GFAC Integration
      • UltraScan job submission previously relied on GRAM4
        – GFAC integrated as middleware to abstract the submission process
        – GRAM5, UNICORE and any future mechanism
      • Science Gateway is in active use
        – Initial testing done on an IU Quarry node
        – Extensively tested the job submission process using GFAC to LONI's QueenBee and TACC's Ranger
        – Deployed 26 October 2010
        – Implementation details available: http://wiki.bcf.uthscsa.edu/cauma/wiki/US2GFACTesting
    • GridChem/ParamChem: Gateways for Computational Chemistry
    • GridChem Science Gateway
      • A chemistry/material Science Gateway for running computational chemistry codes, workflows, and parameter sweeps.
      • Integrates molecular science applications and tools for community use.
      • 400+ users heavily using TeraGrid. One of the consistent top 5 TeraGrid Gateway users.
      • Supports all popular chemistry applications, including Gaussian, GAMESS, NWChem, QMCPack, Amber, MolPro, and CHARMM.
      • ParamChem is a follow-on project to develop workflows for chemical parameter studies and provide the infrastructure to execute them.
    • Empirical Force Fields Parameterization: Need, Process
      • Lack of accurate force fields produces erroneous property estimation.
      • Fig. 1. Errors (V) in electrostatic potential on a surface at 1.8 times van der Waals radii around N-methylpropanamide for two models. (Left) Point charges; (right) charge, dipole, and quadrupole on C, N, and O; charge and dipole on H. The errors are much reduced in the multipole approach. A. J. Stone, Science 321, 787-789 (2008). Published by AAAS.
      • Vanommeslaeghe et al., J. Comp. Chem. 2010, 31, 671-690.
    • Cyberenvironments for Parameterization: Computational Reference Data Generation
    • Conclusions
      • Our project focus is providing long-term sustainable software for science gateways.
      • What we learned:
        – Try to serve a few high profile collaborators very well.
          • Derive good software engineering practices from this: versioning, code reviews, testing, packaging, portability, …
        – Define and keep to your project’s scope.
        – Let the collaborations determine the direction of innovation.
          • This is more than just getting “customer requirements”. Collaborators expect you to know your field and guide them.
      • There is a tension between this and research
        – “Collaborators, not customers” is the resolution.
    • More Information
      • OGCE Web Site: http://www.collab-ogce.org
      • News Feed/Blog: http://collab-ogce.blogspot.com
      • Contact us:
        – ogce-discuss@googlegroups.com
        – http://groups.google.com/group/ogce-discuss/
      • Software Downloads: Software is available as tagged SVN releases from our SourceForge project.
        – http://sourceforge.net/projects/ogce/
        – See http://www.collab-ogce.org/ogce/index.php/Portal_download
    • Backup Slides
    • OGCE Partners and People
      Institution: People
      • Indiana University: Marlon Pierce, Suresh Marru, Raminder Singh, Archit Kulshrestha, Gerald Guo
      • NCSA/UIUC: Sudhakar Pamidighantam, Shaowen Wang, Yan Liu
      • Purdue University: Carol Song, Lan Zhao, David Braun, Shawn Wu
      • UTHSCSA: Emre Brookes, Borries Demeler, Bruce Dubbs
    • Award Highlights
      • Full Circle Development
        – Directly fund both software developers and gateway consumers.
      • Directly supported (non-IU) gateways:
        – UltraScan (UTHSCSA), GridChem (NCSA), SimpleGrid (UIUC), Purdue CCSM and Environmental Gateways
        – Among the most used TG gateways.
      • Sustainability strategy: Apache Incubator for workflow suite of tools
        – XBaya, GFac, and supporting services.
    • SimpleGrid, GISolve
      • Short term goal: develop SimpleGrid Gadgets deployable into the gadget container.
        – Must meet security requirements
        – Support PHP development
        – Support interactivity requirements
          • Integrate YUI JavaScript libraries with Gadget JavaScript.
      • Longer term goals: investigate workflow, job management tools. Apply to GISolve.
    • Purdue CCSM and Data Portals
      • Short term goals: develop CCSM and data management gadgets and necessary backing middleware.
        – Interactivity and security requirements.
        – Significant requirements overlap with SimpleGrid
      • Longer term goals: build gateways out of gadgets hosted by multiple containers; examine workflow and other tools.
    • BioVLAB-MMIA
      • MicroRNAs (miRNAs)
        – small (19-22 nucleotide) non-protein-coding RNA molecules
        – regulate the expression of specific gene products
        – effect translational blockade or message degradation
      • MMIA: microRNA and mRNA integrated analysis
      • Computation in the Cloud
      • MMIA expertise in workflow
    • EXPERIMENTS
    • BioVLAB Summary
      • Usability (reconfigurable environments)
        – BioVLAB adopts the SaaS model of cloud computing: end users only need to launch the pre-composed BioVLAB workflows. With XBaya, users can easily customize them by modifying just a few components and input parameters.
      • Flexibility (full privileges)
        – Following the IaaS model, BioVLAB workflow developers have flexibility in handling computing resources and implementing applications on the Amazon cloud. They can choose specific system resources to satisfy their needs, with fully controlled access.
      • Reduced processing time and cost effectiveness
        – Users can run as many servers as they need and control their usage time as they want. That reduces research cost and the initial time needed to construct physical infrastructure for research.
    • Background: What is AUC?
      • AUC is an important technique for the solution study of macromolecules
      • Molecules are not fixed to a microscope grid
      • Molecules are not distorted by crystal packing forces (vs X-ray crystallography)
      • Very large size range (complements cryo-EM and NMR)
      • Dynamic processes can be studied
      • Conformational changes
    • Background: What is AUC?
      • Sample placed in cell
      • Run ultracentrifuge
        – Usually 20-60k RPM
      • Collect data
        – 4 to 24 hours or more
      • Analyze the data
    • TG SG Usage 2007-10
      • Job statistics for the UltraScan project for approximately the last 4 years.
      • Only partial data is available for 2007 (2nd half) and 2010 (through June), and only successful runs are included.
        – Totals of CPU hours consumed from TeraGrid, UTHSCSA and international resources
        – Number of investigators whose data were analyzed (left Y-axis), and number of submitted jobs (right Y-axis).
      • Both panels indicate increasing usage and need for TeraGrid resources and an increasing number of investigators requiring access to these resources.
    • GFAC Integration UltraScan job submission previously relied on GRAM4 GFAC integrated as middleware to abstract submission process  GRAM5, UNICORE and any future mechanism Science Gateway is in active use Initial testing done on IU quarry node Extensively tested job submission process using GFAC to LONIsQueenBee and TACCs Ranger Deployed 26 October 2010 Implementation details available http://wiki.bcf.uthscsa.edu/cauma/wiki/US2GFACTesting Bac
    • User Community: Publications
      • Since the development of our advanced methods, virtually every publication from our lab has used these methods
      • We currently count 35 peer reviewed journal publications and poster abstracts
      • Many additional presented talks where these methods have provided important new detail to the investigations of biological as well as synthetic polymer systems
      • We are aware of at least another 25 publications that were facilitated by our methods from other laboratories using our TeraGrid applications
    • Conclusion
      • We focus initially on one component per gateway.
        – SimpleGrid, CCSM, Data Portal: gadgets
          • Other gadget based gateways at UC
        – GridChem: XBaya
        – UltraScan: GFac
      • Goal is to establish an Apache-style meritocracy for contributed code.
      • Making distributed teams work: hacking retreats.
    • OGCE Gateway Tool Adaption & Reuse
      [Diagram: tools contributed by and reused across gateways and projects (LEAD, GridChem, UltraScan, BioVLAB, TeraGrid User Portal, ODI, Bio Drug Screen, EST Pipeline, FutureGrid), including GFac, XBaya, XRegistry, the Eventing System, the Resource Discovery Service, Resource Prediction, the Workflow Suite, and the Gadget Container. The OGCE team re-engineers, generalizes, builds, tests, and releases them.]
    • Software Strategy
      • Focus on gadget container and tools for running science applications on grids and clouds.
      • Provide a tool set that can be used in whole or in part.
        – If you just want GFac, then you can use it without buying an entire framework.
      • Outsource security, information services, data and metadata, etc. to other providers.
        – MyProxy, TG IIS, Globus, Condor, XMC Cat, iRods, etc.
    • Advanced Support Scenarios
      • GridChem/ParamChem workflow support
      • UltraScan Job Submission (GFAC)
      • EST Pipeline
        – Bioinformatics pipeline for managing mass job submission.
    • More Information
      • This is downloadable, packaged software.
        – Apache Maven build system provides everything you need to build the gadget container, gadgets, workflow composer, and backing services.
        – Get code by anonymous SVN checkout.
      • Email: mpierce@cs.indiana.edu, smarru@cs.indiana.edu, ogce-discuss@googlegroups.com
      • OGCE Web Site: www.collab-ogce.org
      • Blog/News Feed: http://collab-ogce.blogspot.com/
    • Acknowledgements and People
      • Funding by TeraGrid GIG, RP and by OCI SDCI
      • IU: Marlon Pierce, Suresh Marru, Raminder Singh, Archit Kulshrestha, Zhenhua Guo
      • TACC: Maytal Dahan, Rion Dooley
      • SDSC: Nancy Wilkins-Diehr, Jeff Sale
      • SDSU: Mary Thomas
    • Gateway Computing Environments (GCE10)
    • Molecular Force Field Cyberenvironments: Parameter Initialization and Optimization Workflow
      [Diagram: Parameter definitions; Model/Reference Data Definition; Workflow Manager; Optimization Monitor; Merit Function Specification; Expert Interface; Optimization Methods Choice; Consistency Checker; Optimization Job Launcher; Optimization Job (Incomplete? / Completed?); Parameter testing; Model Testing (Successful); Parameter Sensitivity Analysis; Update Parameter Database with new set; Notification of End of Workflow]
    • OGCE Alumni
      • We also gratefully acknowledge the contributions of participants in previous incarnations of the OGCE:
        – TACC: Maytal Dahan, Rion Dooley
        – SDSU: Mary Thomas
        – SDSC: Nancy Wilkins-Diehr, Jeff Sale
        – LSF: Srinath Perera, Sanjiva Weeravarna