An Overview of Bionimbus and the Open Cloud Consortium
Robert Grossman
Open Cloud Consortium
Institute for Genomics & Systems Biology, University of Chicago
Laboratory for Advanced Computing, University of Illinois at Chicago

Part 1. Bionimbus
www.bionimbus.org

  • Web Portal & Widgets
  • Elastic Cloud Services
  • Database Services
  • Analysis Pipelines & Re-analysis Services
  • Scalable data transport
  • Large Data Cloud Services
  • Data Ingestion Services

Case Study 1: Cistrack
  • Resource for cis-regulatory data.
  • Integrates databases and large data clouds.
  • Open source.
  • Contains raw data, intermediate, and analyzed data from approximately 300 experiments from Agilent, Affy and Solexa platforms.
Flynet Provides Web 2.0 Access to Cistrack
Cube is an Elastic Cloud For Re-analysis
Case Study 2
  • 71 rare, deleterious SNP genotypes were validated by Sequenom.
  • SNP concordance:
      • Alignment against gene models: 46%
      • TopHat alignment: 91%
  • Ran TopHat in Bionimbus using Cube-based VMs.
  • Total time went from 25 days to 1 day.
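The runtime figures in Case Study 2 can be read as a simple speedup ratio. The sketch below is illustrative only: the function names are hypothetical, and the worker estimate assumes ideal linear scaling across equal VMs, which the slide does not claim. The slide also does not say how many Cube VMs were actually used.

```python
import math


def speedup(before_days: float, after_days: float) -> float:
    """Ratio of the original wall-clock time to the new wall-clock time."""
    if before_days <= 0 or after_days <= 0:
        raise ValueError("durations must be positive")
    return before_days / after_days


def min_workers_for_speedup(target_speedup: float) -> int:
    """Fewest equal workers that could reach the target speedup.

    Assumes perfectly parallel work with no overhead (an idealization,
    not a figure reported in the talk).
    """
    return math.ceil(target_speedup)


if __name__ == "__main__":
    s = speedup(before_days=25, after_days=1)  # figures from the slide
    print(f"speedup: {s:.0f}x")
    print(f"workers needed under ideal scaling: {min_workers_for_speedup(s)}")
```

Under these assumptions the reported change corresponds to a 25x speedup, so at least 25 concurrent VMs would be needed if the work scaled perfectly; real pipelines usually need more.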
Case Study 3: modENCODE Worm/Fly peak calling reanalysis
  • ssh, ftp
  • Virtual Machines (App on OS, three shown)
  • Working Space
  • Simple Persistent Storage (glusterfs)
  • Hypervisors
  • Racks of Hardware
  • Private cloud (Eucalyptus & Cube)

Hybrid Clouds
  • ami-efa24c86
  • Bionimbus virtual machine images
  • Virtual Machines (App on OS, three shown)
  • Hypervisors
  • Hardware Cluster
  • Public cloud
  • Private / Community cloud
Bionimbus Delivery Mechanisms
  • Log in and use the Bionimbus cloud.
  • Use Bionimbus Virtual Machine Images in a) your private cloud; b) the Bionimbus cloud; c) public clouds such as Amazon.
  • Bionimbus is open source and you can build your own cloud (and interoperate with ours). (First release of integrated system: 3Q 2010.)
  • Bionimbus data services for genomic data, even for large datasets.
Elastic Clouds
  • Goal: Minimize cost of virtualized machines & provide on-demand.
Large Data Clouds
  • Goal: Maximize data (with matching compute) and control cost.
HPC
  • Goal: Minimize latency and control heat.

A successful cloud will…
  • Web 2.0/3.0 user interface
  • Compute services at the scale of a data center.
  • High speed network to move & share the data
  • Persist & refresh data over the long term
Part 2.
www.opencloudconsortium.org

  • 501(c)(3) Not-for-profit corporation
  • Develops standards, interoperability frameworks, and reference implementations.
  • Operates clouds.
  • Develops benchmarks.
  • One area of focus: bridge between private and public clouds.

Operates Clouds
  • 500 nodes
  • 3000 cores
  • 1.5+ PB
  • Four data centers
  • 10 Gbps
  • Target to refresh 1/3 each year.
  • Open Cloud Testbed
Open Science Data Cloud
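The figures on the "Operates Clouds" slide support a few back-of-the-envelope calculations. The sketch below is a rough illustration, not a description of how OCC plans capacity. It assumes decimal petabytes, treats 1.5 PB as a lower bound, and assumes a single fully utilized 10 Gbps link with no protocol overhead. The function and constant names are hypothetical.

```python
from fractions import Fraction

# Figures taken from the slide.
NODES = 500
CORES = 3000
STORAGE_PB = 1.5                    # slide says "1.5+ PB"; used as a lower bound
LINK_GBPS = 10
REFRESH_FRACTION = Fraction(1, 3)   # "refresh 1/3 each year"


def cores_per_node(cores: int, nodes: int) -> float:
    """Average number of cores per node."""
    return cores / nodes


def nodes_refreshed_per_year(nodes: int, fraction: Fraction) -> int:
    """Approximate number of nodes replaced each year, rounded to a whole node."""
    return round(nodes * fraction)


def transfer_days(petabytes: float, gbps: float) -> float:
    """Days to move the data over one saturated link (idealized, no overhead)."""
    bits = petabytes * 1e15 * 8      # decimal petabytes to bits
    seconds = bits / (gbps * 1e9)    # Gbps to bits per second
    return seconds / 86_400


if __name__ == "__main__":
    print("cores per node:", cores_per_node(CORES, NODES))
    print("nodes refreshed per year:", nodes_refreshed_per_year(NODES, REFRESH_FRACTION))
    print("years for a full refresh:", 1 / REFRESH_FRACTION)
    print(f"days to move {STORAGE_PB} PB at {LINK_GBPS} Gbps: "
          f"{transfer_days(STORAGE_PB, LINK_GBPS):.1f}")
```

Under these assumptions the testbed averages 6 cores per node, replaces roughly 167 nodes a year on a three-year cycle, and would need about two weeks to move its full storage over a single 10 Gbps link.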
Cloud-based Disaster Relief Services

OCC Members
  • Companies: Yahoo, Cisco, Aerospace Corp., Booz Allen Hamilton, InfoBlox, Open Data Group, Raytheon
  • Universities: CalIT2, Johns Hopkins, Northwestern University, University of Chicago, University of Illinois at Chicago
  • Government agencies: NASA

Open Cloud Consortium Perspective
  • Vendor neutral
  • Open, interoperable architecture
  • Experiment at scale
  • Operate infrastructure at the scale of a small data center
  • Long term point of view (think like a library, not a cloud service provider)
  • Think public, private & hybrid clouds

Condo Clouds
  • Raywulf rack

Open Cloud Testbed
  • Phase 2
  • 9 racks
  • 250+ Nodes
  • 1000+ Cores
  • 10+ Gb/s
  • C-Wave, CENIC, Dragon, MREN
  • Hadoop, Sector/Sphere, Thrift, KVM VMs, Eucalyptus, Nova

An Overview of Bionimbus (March 2010)

    Open Science Data Cloud
      • Astronomical data
      • Biological data (Bionimbus)
      • Networking data
      • Image processing for disaster relief
  • 27.
    Diagram labels: Applications, Apps, Compute Services, Cloud, Metadata Services, Data Services, PaaS, Storage Services, Identity Manager, Virtual Machine Manager, Virtual Network Manager, IaaS, Network Transport
  • 31.
    Cloud Storage Services
      • Large Data Cloud Interoperability Framework
      • SNIA Cloud Data Management Interface (CDMI)
      • Infrastructure as a Service
  • 34.
    Virtual Machines (VM)
      • Open Cloud Computing Interface (OCCI)
      • Open Virtualization Format (OVF)
  • 37.
    Thank You
    For more information:
      • www.bionimbus.org
      • www.opencloudconsortium.org
      • rgrossman.com (for research papers, etc.)