Transcript

  • 1. caGrid 1.0 Service Infrastructure
    • ISMB/ECCB 2007 Bioinformatics Open Source Conference, Vienna, Austria, July 18, 2007
    • Avinash Shanbhag, Director, Core Infrastructure Engineering
    • National Cancer Institute Center for Biomedical Informatics and Information Technology, USA
  • 2. Agenda
    • High Level Overview
    • caGrid Service Architecture
    • Component Highlights
    • Project Resources
  • 3. What is caBIG?
    • Common, widely distributed infrastructure that permits the cancer research community in the USA to focus on secure data sharing
    • Shared, harmonized set of terminology, data elements, and data models that facilitate information exchange
    • Collection of interoperable applications developed to common standards
    • Cancer research data is available for mining and integration
  • 4. caGrid – Service Infrastructure supporting caBIG
    • Requirements:
      • Support scientific requirements: Use cases from cancer research community
      • Support functional requirements: identifiers, workflow, query, etc.
      • Support non-functional requirements: security, reliability, performance, etc.
    • Principles:
      • Driven by cancer research community requirements
      • caBIG Principles
        • Open Source, Open Access, Open Development
        • Federated
        • Syntactic and Semantic Interoperability
      • Services-Oriented Architecture
      • Metadata driven and implements Virtualization
      • Standards based
  • 5. History of caGrid
  • 6. What is a Community Provided caGrid Service?
    • Standardized, common pattern and mechanism for remote access
      • Language and implementation technology independent
    • Common security infrastructure for authentication and authorization
    • Standardized service metadata models and metadata advertisement mechanisms
    • Community provided service types:
      • Data Services
        • Expose data to the grid in a unified way
      • Analytical Services
        • Expose analytical operations to the grid
  • 7. caGrid Services - Strongly Typed and Semantically Rich
    • Object Oriented APIs and data resources are developed using Object types and UML information models registered in the caDSR
    • These systems are grid-enabled by defining a grid service interface that defines the functionality to be exposed to the grid
    • The grid service interface uses the same Object types as the existing system, but leverages a platform and language neutral representation (XML) of them
    • The grid service implementation maps service invocations to API calls or queries into the existing system
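The pattern described on this slide can be sketched roughly as follows. This is a minimal illustration in Python, not caGrid code: `Gene`, `ExistingGeneApi`, `GeneGridService`, and the XML element names are all hypothetical, chosen only to show a service interface that exchanges XML representations of the same object types and delegates to an existing API.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical object type shared by the existing API and the service."""
    symbol: str
    name: str


class ExistingGeneApi:
    """Stand-in for an existing object-oriented API (hypothetical)."""

    def __init__(self):
        self._genes = [
            Gene("BRCA1", "breast cancer 1"),
            Gene("TP53", "tumor protein p53"),
        ]

    def find_by_symbol(self, symbol):
        return [g for g in self._genes if g.symbol == symbol]


def gene_to_xml(gene):
    """Render a Gene in a platform and language neutral XML form."""
    elem = ET.Element("Gene")
    ET.SubElement(elem, "symbol").text = gene.symbol
    ET.SubElement(elem, "name").text = gene.name
    return elem


class GeneGridService:
    """Service interface: accepts and returns XML, delegates to the API."""

    def __init__(self, api):
        self._api = api

    def find_genes(self, request_xml):
        # Map the service invocation onto a call into the existing system.
        request = ET.fromstring(request_xml)
        symbol = request.findtext("symbol")
        response = ET.Element("GeneList")
        for gene in self._api.find_by_symbol(symbol):
            response.append(gene_to_xml(gene))
        return ET.tostring(response, encoding="unicode")


service = GeneGridService(ExistingGeneApi())
reply = service.find_genes("<FindGenes><symbol>TP53</symbol></FindGenes>")
print(reply)
```

The existing API is left unchanged; only the thin service layer knows about XML, which is what lets clients in other languages call it.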
  • 8. Service Layers
  • 9. Service Layers: caBIO Data Service example
    • Common Data Service Operations (WSDL)
    • CQL, CQLResult, Data Service Faults (XSD)
    • caBIO Schemas (XSD)
    • caGrid Metadata Schemas (XSD)
    • WS-Enumeration Operations and Types (WSDL, XSD)
    • Introduce-managed Security constraints
    • GTS-managed Trusted Authorities
    • CSM/Grid Grouper Authorization
    • Introduce-generated ServiceMetadata
    • Introduce-generated DomainModel
    • Introduce-generated Resource to manage metadata
    • Introduce-generated Resources to manage enumerations
    • Introduce-generated code to manage service group registration and maintenance
    • Introduce managed configuration points:
      • Index Service Location
      • Data Service Component Implementations (CQL Processor, Validators)
      • ApplicationService Information
      • Other options
    • Introduce-provided common operation implementations (Resource Property, Security Metadata)
    • caGrid-provided CQL implementation to query ApplicationService
  • 10. caGrid Components
    • Leverage existing technologies:
      • caDSR, EVS, Mobius GME: Common data elements, controlled vocabularies, schema management
      • Globus Toolkit (currently version 4.0.3)
        • Core grid services infrastructure
        • Service deployment, service registry, invocation, base security infrastructure
    • Additional Core Infrastructure
      • Higher-level security services
      • Grid service access to metadata components (caDSR, EVS, GME, etc.)
      • Workflow, Identifier, Federated Query services
    • Service Provider Tooling (Introduce)
      • Graphical service development and configuration environment
      • Abstractions from grid service infrastructure for Data and Analytical services
      • Deployment wizards
    • Client Tooling
      • Installer
      • High-level APIs for interacting with core components and services
      • Graphical Tools (administration tools, sample applications, etc.)
    • Production Deployment and Support of Infrastructure Services
  • 11. caGrid Production Environment
  • 12. caGrid Projects
    • The caGrid release is oriented around a number of individual projects
    • Build process manages inter-project dependencies
    • Each project provides a specific set of functionality, and is self-contained once caGrid is built
    • Grid Services:
      • authentication-service, cadsr, dorian, evs, fqp, gme, gridgrouper, gts, index, syncgts, workflow, ws-naming, ws-transfer
    • Grid Service Components and Extensions:
      • authz, bulkDataTransfer, cabigextensions, data, sdkQuery, sdkQuery32, service-security-provider, ws-enum, ws-handlesystem
    • Utilities and APIs:
      • AntInstallerFramework, core, discovery, graph, gridca, metadata, metadatautils, opensaml
    • Applications:
      • installer, introduce, portal, security-ui
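As a rough illustration of what it means for a build process to manage inter-project dependencies, the sketch below computes a valid build order with a topological sort. The project names come from the list above, but the dependency edges are invented for this example and are not taken from the caGrid build files. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Maps each project to the projects that must be built before it.
# These edges are hypothetical and only illustrate the idea.
dependencies = {
    "core": set(),
    "metadata": {"core"},
    "gts": {"core"},
    "syncgts": {"gts"},
    "introduce": {"core", "metadata"},
}

# static_order() yields every project after all of its prerequisites.
build_order = list(TopologicalSorter(dependencies).static_order())
print(build_order)
```

Any order that places each project after its prerequisites is acceptable, so the exact sequence among independent projects is not significant.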
  • 13. Metadata Services
    • Cancer Data Standards Repository (caDSR)
      • caBIG projects register their data models as Common Data Elements (CDEs), which are semantically harmonized and then centrally stored and managed in the caDSR
      • The caDSR grid service provides:
        • Model discovery and traversal
        • caGrid standard metadata generation capabilities
    • Enterprise Vocabulary Services (EVS)
      • EVS is a set of services and resources that address the need for controlled vocabulary
      • The EVS grid service provides:
        • Query access to the data semantics and controlled vocabulary managed by the EVS
    • Global Model Exchange (GME)
      • GME is a DNS-like data definition registry and exchange service that is responsible for storing and linking together data models in the form of XML schema.
      • The GME grid service provides:
        • Access to the authoritative structural representation of data types on the grid
    • Globus Information Services: Index Service
      • The Globus Information Services infrastructure provides a generic framework for aggregation of service metadata, a registry of running Grid services, and a dynamic data-generating and indexing node, suitable for use in a hierarchy or federation of services
      • The Index grid service provides:
        • Yellow and white pages for the grid
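The "DNS-like" description of the GME on this slide can be pictured as a registry keyed by XML namespace, where each schema may import others. The sketch below is a toy model under that assumption, not the GME API: `SchemaRegistry`, `publish`, `resolve`, and the `urn:example:*` namespaces are hypothetical.

```python
class SchemaRegistry:
    """Toy namespace-keyed schema registry (hypothetical; not the GME API)."""

    def __init__(self):
        self._schemas = {}  # namespace -> schema text
        self._imports = {}  # namespace -> list of imported namespaces

    def publish(self, namespace, schema_text, imports=()):
        self._schemas[namespace] = schema_text
        self._imports[namespace] = list(imports)

    def resolve(self, namespace):
        """Return the namespace and everything it imports, dependencies first."""
        ordered, seen = [], set()

        def visit(ns):
            if ns in seen:
                return
            if ns not in self._schemas:
                raise KeyError(f"no schema published for namespace {ns!r}")
            seen.add(ns)
            for dep in self._imports[ns]:
                visit(dep)
            ordered.append(ns)

        visit(namespace)
        return ordered


registry = SchemaRegistry()
registry.publish("urn:example:common", "<xs:schema/>")
registry.publish("urn:example:gene", "<xs:schema/>",
                 imports=["urn:example:common"])
registry.publish("urn:example:assay", "<xs:schema/>",
                 imports=["urn:example:gene", "urn:example:common"])

resolved = registry.resolve("urn:example:assay")
print(resolved)
```

Resolving one namespace returns the linked set of schemas a client would need in order to validate or bind a data type.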
  • 14. caGrid Security Components
    • Dorian
      • Grid User Account Management
      • Enables Identity Management and Federation
    • Authentication Service
      • Provides a uniform authentication interface on which applications can be built, and a framework for issuing SAML assertions for existing credential providers so that they can be easily integrated with Dorian and other grid credential providers
    • Grid Trust Service (GTS)
      • Creation and Management of a federated trust fabric.
      • Supports applications and services in deciding whether or not signers of digital credentials/user attributes can be trusted.
    • Grid Grouper
      • Grid Group / VO Management
      • Enables Group/VO Based Authorization
    • Authorization Support
      • Provides a framework to perform service authorization based on permissions from the Common Security Module (CSM) and on Grid Grouper groups
    • Security Communication Metadata
      • Metadata providing the ability for two parties to negotiate a communication mechanism which meets the service’s requirements
    • Grid CA
      • APIs and command-line tools for a platform-independent certificate authority
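The Authorization Support item above combines two sources of authorization data: permissions and group membership. A minimal sketch of that idea follows. It is not the CSM or Grid Grouper API: `OperationPolicy`, `Authorizer`, the identities, and the rule that every configured check must pass are assumptions made for illustration; the slide does not say how the two sources are combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperationPolicy:
    """Hypothetical per-operation policy; a None field means 'not required'."""
    required_permission: Optional[str] = None
    required_group: Optional[str] = None


class Authorizer:
    def __init__(self, permissions, groups, policies):
        self._permissions = permissions  # identity -> set of permission names
        self._groups = groups            # group name -> set of member identities
        self._policies = policies        # operation name -> OperationPolicy

    def is_authorized(self, identity, operation):
        policy = self._policies.get(operation)
        if policy is None:
            return False  # deny operations that have no policy
        if policy.required_permission is not None:
            granted = self._permissions.get(identity, set())
            if policy.required_permission not in granted:
                return False
        if policy.required_group is not None:
            members = self._groups.get(policy.required_group, set())
            if identity not in members:
                return False
        return True


alice = "/O=Example/CN=alice"  # hypothetical grid identities
bob = "/O=Example/CN=bob"

authorizer = Authorizer(
    permissions={alice: {"READ"}, bob: {"READ"}},
    groups={"example:curators": {alice}},
    policies={
        "query": OperationPolicy(required_permission="READ"),
        "update": OperationPolicy(required_permission="READ",
                                  required_group="example:curators"),
    },
)

print(authorizer.is_authorized(alice, "update"))
print(authorizer.is_authorized(bob, "update"))
```

Denying operations that have no policy keeps the default closed, which is a common choice for service-side checks.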
  • 15. Introduce Overview
    • A framework which enables fast and easy creation of strongly typed and highly interoperable grid services
    • Provides a powerful extension system wherein specific functionality can be added to the service or service editing process
      • Support for caDSR, GME, caGrid metadata, Data Services, and caGrid authorization services are all added this way
    • Abstracts all the details of the grid from the developer, allowing them to focus on the business logic being exposed
    • Provides a graphical environment
  • 16. Introduce Graphical Development Environment
    • GUI for creating and manipulating a grid service
      • Provides means of simple creation of service skeleton that a developer can then implement, build, and deploy
      • Automatic code generation of complete caBIG compliant grid service which is configured to provide:
        • Advertisement
        • Standard Metadata
        • Security
        • Complete Client API
  • 17. GAARDS Security Infrastructure
  • 18. Project Resources and Communication
    • caGrid Homepage:
      • https://cabig.nci.nih.gov/workspaces/Architecture/caGrid
      • http://www.cagrid.org
    • caGrid 1.0 Release:
      • Release Notes: http://gforge.nci.nih.gov/frs/shownotes.php?release_id=952
      • http://gforge.nci.nih.gov/frs/?group_id=25&release_id=952
    • caGrid 1.0 GForge Home:
      • Feature Requests
      • Bug Reports
      • Discussion Forums
      • Public Wiki
      • Downloads / Source Repository
      • http://gforge.nci.nih.gov/projects/cagrid-1-0/
    • caGrid Users Mailing List
      • https://list.nih.gov/archives/cagrid_users-l.html
      • [email_address]
    • Architecture Workspace
      • Community direction from Working Groups
      • Report out and feedback during WS calls
  • 19. Acknowledgements: caGrid Team
    • Ohio State University
      • Joel Saltz
      • Scott Oster
      • Shannon Hastings
      • Stephen Langella
      • David Ervin
      • Tahsin Kurc
    • Argonne National Laboratory
      • Ian Foster
      • William E. Allcock
      • Frank Siebenlist
      • Mike Wilde
      • Ravi Madduri
      • Jarek Gawor
      • Rachana Ananthakrishnan
    • Duke University
      • Patrick McConnell
    • Georgetown University
      • Steve Moore
      • Arnie Miles
      • Paul Kennedy
      • Chad La Joie
    • Science Applications International Inc.
      • Manav Kher
    • ScenPro Inc
      • David Wellborn
      • Val Bragg
    • SemanticBits, LLC
      • Vinay Kumar
    • Oracle Corp.
      • Christophe Ludet
    • Booz Allen Hamilton
      • Arumani Manisundaram
  • 20. Acknowledgements
    • National Cancer Institute Center for Bioinformatics
      • George Komatsoulis
      • Frank Hartel
      • Denise Warzel
      • Peter Covitz