Services for Science v2 (APAN26)

888 views

Published on

Talk given at APAN26 in beautiful Queenstown, New Zealand. Updates my INGRID talk with new material on caBIG--including some nice slides provided by Ravi Madduri.

Published in: Technology, Education
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
888
On SlideShare
0
From Embeds
0
Number of Embeds
15
Actions
Shares
0
Downloads
7
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • I would mention data integration here: integration of different data types: tissue, (pre)clinical, genomics, proteomics….
  • Developer benefits:
    Developers spend time developing scientific applications instead of rewriting code to turn into services.
    Developer time spared through code reuse—legacy apps are wrappered into services instead of rewritten as services
    Learning gRavi easier than learning service development—especially marshaling of security resources!
    Traditional applications are not secure and can not provide an audit trail
    Scientific benefits:
    Scientists spend time developing workflows rather than moving data from application to application
    Scientists interact with GUI to focus on scientific process not how to run applications.
    Services provide easy access to all scientific tools in one integrated GUI.
    Access to grid resources promotes cross disciplinary collaboration as it becomes easier to share data and applications across institutional boundaries
  • Use funnel animation?
  • The GT4 plug-in support semantic based service query. Users can input multiple (up to three, in current scenario three is enough. We can add more in our program upon request) service query criteria and input the corresponding value.
    We can combine multiple criteria. The initial GUI only shows one query criteria, but more can be added by clicking “Add Service Query” button. For example, we can query the caGrid services whose “Research Center” name is “Ohio State University”, with Service Name “DICOMDataService”, and has operation “PullOp”.
  • Services for Science v2 (APAN26)

    1. 1. Ian Foster Computation Institute Argonne National Lab & University of Chicago Services for Science
    2. 2. 2 Thanks! DOE Office of Science NSF Office of Cyberinfrastructure National Institutes of Health Colleagues at Argonne, U.Chicago, USC/ISI, OSU, Manchester, and elsewhere
    3. 3. 3 Scientific Communication, ~1600 Brahe Kepler
    4. 4. 4 1980
    5. 5. 5 Scientific Communication, ~2000 Data ArchivesData Archives User Analysis toolsAnalysis tools Gateway Figure: S. G. Djorgovski Discovery toolsDiscovery tools Service-Oriented Science
    6. 6. 6 Application Scenario Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis Microarray and protein databases at other institutions Different database systems, data representations, security Different program invocation, remote access, data transfer
    7. 7. 7 caBIG: sharing of infrastructure, applications, and data. Data Integration! Services & Cancer Biology Globus
    8. 8. 8 Service-Oriented Science People create services (data, code, instr.) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
    9. 9. 9 Creating Services People create services (data, code, instr.) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
    10. 10. 10 Anatomy of a Service op1 opN (meta)data Implementation(s) Clients Registry Management Clients … Service Service Attribute Authority Attribute Authority Persistence
    11. 11. 11 Creating Services (~2005) “This full-day tutorial provides an introduction to programming Java services with the latest version of the Globus Toolkit version 4 (GT4). The tutorial teaches how to build a Java Service that makes use of GT4 mechanisms for state management, security, registry and related topics.”
    12. 12. 12 Appln Service Create Index service Store Repository Service Advertize Discover Invoke; get results Introduce Container Transfer GAR Deploy Ohio State University and Argonne/U.Chicago Creating Services in 2008 Introduce and gRAVI Introduce Define service Create skeleton Discover types Add operations Configure security Grid Remote Application Virtualization Infrastructure Wrap executables Globus
    13. 13. Demonstration: Creating Services Introduce + gRAVI Shannon Hastings Scott Oster David Ervin Stephen Langella Kyle Chard Ravi Madduri
    14. 14. 14Center for Enabling Distributed Petascale Science Workflow Automation at DOE Facilities Automation Reproducibility Security Reusability Storage Metadata Analysis Visualization Advanced Photon Source
    15. 15. 15 Discovering Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
    16. 16. 16  The ultimate arbiter?  Types, ontologies  Can I use it?  Billions of services Discovering Services Assume success Syntax, semantics Permissions Reputation A B
    17. 17. 17 Discovery (1): Registries Globus
    18. 18. 18 Discovery (2): Standardized Vocabularies Core Services Grid Service Uses Terminology Described In Cancer Data Standards Repository Enterprise Vocabulary Services References Objects Defined in Service Metadata Publishes Subscribes to and Aggregates Queries Service Metadata Aggregated In Registers To Discovery Client API Index Service Globus
    19. 19. 19
    20. 20. 20 Discovery (3): Tagging & Social Networking GLOSS: Generalized Labels Over Scientific data Sources (Foster, Nestorov)
    21. 21. 21 Discovery (3): Tagging & Social Networking David de Roure, Carole Goble, et al.
    22. 22. 22 Composing Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
    23. 23. 23 Composing Services: E.g., BPEL Workflow System Data Service @ uchicago.edu Analytic service @ osu.edu Analytic service @ duke.edu <BPEL Workflow Doc> <Workflow Inputs> <Workflow Results> BPEL Engine link caBiG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al. link link link See also Kepler & Taverna Globus
    24. 24. 24 Composing Services: Taverna caGrid Scavenger with semantic/metadata- based caGrid service query A sample caGrid workflow Globus
    25. 25. 25 Composing Services Globus
    26. 26. Demonstration: Composing Services Taverna + GT4 Taverna team Wei Tan Ravi Madduri
    27. 27. 27 Publishing Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
    28. 28. 28 Publishing Services Description  Syntax, semantics State  Availability, load, … Policies  Who, what, when, … Hosting  Location, scalability, …
    29. 29. 29 Authorization: SAML & XACML VOMS Shibboleth LDAP PERMIS… GT4 Client GT4 Server PDP Attributes Authorization Decision PIP PIP PIP SAML XACML Globus
    30. 30. 30 Hosting Services People create services (data or functions) … which I discover (& decide whether to use) … & compose to create a new function ... & then publish as a new service.  I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!  I hope that this “someone else” can manage security, reliability, scalability, … !! “Service-Oriented Science”, Science, 2005
    31. 31. 31 The Two Dimensions of Service-Oriented Science Decompose across network Clients integrate dynamically Select & compose services Select “best of breed” providers Publish result as new services Decouple resource & service providers Function Resource Data Archives Analysis tools Discovery toolsUsers Fig: S. G. Djorgovski
    32. 32. 32 The geWorkbench/caGrid/TeraGrid Interface
    33. 33. 33 Putting It Together for the Example Scenario Location A Microarray, Protein, Image data Location B Microarray, Protein, Image data Location C Microarray, Protein, Image data Location C Image Analysis Location D Image Analysis caGrid Service Interfaces caGrid Environ- ment Registered Object Definitions Advertise- ment Log on, Grid credentials Query and Analysis Workflow Discovery Microarray & protein databases at other institutions
    34. 34. 34 Lessons Learned A convenient higher-level abstraction Suitable for a subset of scientific use cases Infrastructure need to be sustainable Integrates well with hospital/cancer center/experimental facility IT infrastructure Workflows are attractive to users Scalability and provenance are important No vendor lock-in (if you are careful) User experience remains ambiguous Early adopters are enthusiastic (50+ services)
    35. 35. 35 Services for Science A new approach to communicating A (not-so new) approach to structuring systems They’re real Excellent infrastructure and tools (Globus, Introduce, gRAVI, Taverna, Swift, etc., etc.) Substantial numbers of services out there They’re challenging Sociology: incentives, rewards Infrastructure: hosting Provenance: justifying “results” Scaling: services, requests

    ×