Service Oriented Bioscience Cluster at OSC

    • Service Oriented Bioscience Cluster at OSC Umit V. Catalyurek Associate Professor Dept. of Biomedical Informatics Dept. of Electrical & Computer Engineering The Ohio State University
    • Origins of caBIG
      • Goal: Enable investigators and research teams nationwide to combine and leverage their findings and expertise in order to meet the NCI 2015 Goal.
      • Strategy: Create a scalable, actively managed organization that will connect members of the NCI-supported cancer enterprise by building a biomedical informatics network.
      • NCI 2015 Goal: “Relieve suffering and death due to cancer by the year 2015”
    • Driving needs: cancer Biomedical Informatics Grid
      • A multitude of “legacy” information systems, most of which cannot be readily shared between institutions
      • An absence of tools to connect different databases
      • An absence of common data formats
      • A huge and growing volume of data must be collected, analyzed, and made accessible
      • Few common vocabularies, making it difficult, if not impossible, to interlink diverse research and clinical results
      • Difficulty in identifying and accessing available resources
      • An absence of information infrastructure to share data within an institution, or among different institutions
    • What is caBIG?
      • Common, widely distributed infrastructure that permits the cancer research community to focus on innovation
      • Shared, harmonized set of terminology, data elements, and data models that facilitate information exchange
      • Collection of interoperable applications developed to common standards
      • Cancer research data available for mining and integration
    • What is caGrid?
      • A grid based software infrastructure consisting of services, toolkits, APIs, and applications
      • A production grid deployment of the core services provided by that infrastructure
      • A community of developers leveraging that grid and infrastructure to provide applications and services to the cancer research community
    • What is caGrid?
      • Development project of Architecture Workspace
      • The Grid infrastructure for caBIG (the “G” in caBIG)
      • Driven from use cases and needs of cancer research community
      • Service Oriented Architecture
      • Based on federation
      • Model Driven
      • Object-Oriented, Semantically-Annotated Data Virtualization
    • What is caGrid? cont…
      • Builds on existing Grid technologies
      • Provides additional enterprise Grid components
        • Grid Service Graphical Development Toolkit
        • Metadata Infrastructure
        • Advertisement and Discovery
        • Semantic Services
        • Data Service Infrastructure
        • Analytical Service Infrastructure
        • Identifiers
        • Workflow
        • Security Infrastructure
        • Client tooling
    • caGrid Community Involvement
      • caGrid itself provides no real “data” or “analysis” to caBIG™; it is the enabling infrastructure which allows the community to do so
      • Community members add value to the grid as applications, services, and processes (for example: shared workflows)
        • caGrid provides the necessary core services, APIs, and tooling
      • The real “value” of the grid comes from bringing this information to the “end user”
      • Community members develop end user applications which consume the resources provided by the grid
    • caGrid @ OSC
      • Goals:
        • Create an expandable caGrid Installation at OSC
        • Deploy Pilot Applications to demonstrate
          • Service Oriented Access to HPC resources
      • Dorian, GTS and Index services are deployed
      • SyncGTS is deployed along with Dorian and Index for performance
      • caGrid 1.2 was released this week, and we deployed it!
    • Pilot Application: TMA
      • Image Mining for Performing Comparative Analysis of Expression Patterns in Tissue Microarrays
        • Project funded by NIH R01 (PI: David Foran, Co-PI: Joel Saltz)
      • Development of innovative methods for analysis of tissue microarrays
        • Computation of features, annotation of image data based on features
      • Development of software support
        • to manage and share tissue microarray data and analysis results
        • to process large volumes of tissue microarray data on high performance systems
      • Development of ability to share data and analytical resources using caGrid
      • Supports the Help Defeat Cancer project, which involves 100,000 imaged histology specimens originating from breast, head & neck, and colorectal cancers
    • TMA Analytical Service Implementation
      • TMA Application is a pipelined workflow
        • Several processing steps that need to be applied in sequence to the images
        • Build a prototype workflow orchestration system
        • Wraps a program execution
          • Stage the data in
          • Invoke the executable
          • Retrieve the output files
        • Uses caGrid’s bulk data transfer to move files from host to host
        • Interacts with a scheduler to allocate resources for the execution
          • Executable can be a parallel/distributed application
      • TMA user interface
        • Specify the workflow
          • List with executables and parameters
        • Invoke the service for the first stage
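    The pipelined workflow described on this slide can be illustrated with a minimal Python sketch. It is not the TMA service code: the names run_stage and run_workflow are hypothetical, a local file copy stands in for caGrid's bulk data transfer, and the scheduler interaction is left out. The workflow is specified as a list of executables with parameters, and each stage stages its data in, invokes the executable, and retrieves the output files, which become the input of the next stage.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_stage(executable, params, input_files, work_dir):
    """Run one pipeline stage (hypothetical helper, not a caGrid API)."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # 1. Stage the data in. Assumption: a local copy replaces the
    #    host-to-host bulk data transfer mentioned on the slide.
    staged = set()
    for src in input_files:
        dst = work_dir / Path(src).name
        shutil.copy(src, dst)
        staged.add(dst.name)

    # 2. Invoke the executable inside the stage's working directory.
    #    A real service would hand this step to a scheduler instead.
    subprocess.run([executable, *params], cwd=work_dir, check=True)

    # 3. Retrieve the output files: every file that was not staged in.
    return sorted(
        p for p in work_dir.iterdir() if p.is_file() and p.name not in staged
    )


def run_workflow(stages, initial_inputs, base_dir):
    """Apply the stages in sequence, feeding each stage's outputs to the next."""
    inputs = list(initial_inputs)
    for index, (executable, params) in enumerate(stages):
        inputs = run_stage(executable, params, inputs, Path(base_dir) / f"stage_{index}")
    return inputs


if __name__ == "__main__":
    # Two toy stages; Python one-liners stand in for real analysis executables.
    upper = "from pathlib import Path; Path('b.txt').write_text(Path('a.txt').read_text().upper())"
    count = "from pathlib import Path; Path('c.txt').write_text(str(len(Path('b.txt').read_text())))"
    workflow = [(sys.executable, ["-c", upper]), (sys.executable, ["-c", count])]

    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "a.txt"
        source.write_text("tissue")
        outputs = run_workflow(workflow, [source], Path(tmp) / "run")
        print([p.name for p in outputs])
```

    Keeping each stage in its own working directory makes "retrieve the output files" a simple directory comparison, and it would let each stage run on a different host once the local copy is replaced by a real transfer.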
    • What is next?
      • Next Pilot Application: Prof. Dan Janies’ Supramap
        • Builds a phylogenetic tree and projects onto the map of the planet
        • Computationally expensive
      • Next Pilot Application(s): Your Application!?
      • More Info: and
      • Contact: Umit V. Catalyurek email: