An Overview of Bionimbus (March 2010)


This is a talk I gave at NHGRI in March 2010.

Published in: Technology

  1. An Overview of Bionimbus and the Open Cloud Consortium
     Robert Grossman
     Open Cloud Consortium
     Institute for Genomics & Systems Biology, University of Chicago
     Laboratory for Advanced Computing, University of Illinois at Chicago
  2. Part 1. Bionimbus
  3. Web Portal & Widgets
     Elastic Cloud Services
     Database Services
     Analysis Pipelines & Re-analysis Services
     Scalable data transport
     Large Data Cloud Services
     Data Ingestion Services
  4. Case Study 1: Cistrack
     Resource for cis-regulatory data.
     Integrates databases and large data clouds.
     Open source.
     Contains raw, intermediate, and analyzed data from approximately 300 experiments on Agilent, Affy and Solexa platforms.
  5. Flynet Provides Web 2.0 Access to Cistrack
  6. Cube is an Elastic Cloud For Re-analysis
  7. Case Study 2
     71 rare, deleterious SNP genotypes were validated by Sequenom.
     SNP concordance:
       Alignment against gene models: 46%
       TopHat alignment: 91%
     Ran TopHat in Bionimbus using Cube-based VMs.
     Total time went from 25 days to 1 day.
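The numbers in Case Study 2 lend themselves to a quick back-of-the-envelope check. The sketch below is only an illustration: it assumes the two concordance percentages are measured over all 71 Sequenom-validated SNPs (the slide does not say so explicitly), and the function names are hypothetical.

```python
# Figures taken from the Case Study 2 slide.
VALIDATED_SNPS = 71
CONCORDANCE = {"gene models": 0.46, "TopHat": 0.91}
DAYS_BEFORE = 25   # total time before moving to Cube-based VMs
DAYS_AFTER = 1     # total time on Bionimbus


def approx_concordant(total: int, rate: float) -> int:
    """Approximate number of concordant SNPs.

    Assumption: the rate applies to all validated SNPs, and the result
    is rounded to a whole SNP because the slide gives only percentages.
    """
    return round(total * rate)


def speedup(before: float, after: float) -> float:
    """Ratio of the old running time to the new one."""
    return before / after


counts = {name: approx_concordant(VALIDATED_SNPS, rate)
          for name, rate in CONCORDANCE.items()}

if __name__ == "__main__":
    for name, count in counts.items():
        print(f"{name}: about {count} of {VALIDATED_SNPS} SNPs concordant")
    print(f"speedup: {speedup(DAYS_BEFORE, DAYS_AFTER):.0f}x")
```

Under that assumption, the TopHat alignment recovers roughly twice as many validated SNPs as the gene-model alignment, and the move to Cube-based VMs is a 25-fold reduction in total time.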
  8. Case Study 3
     modENCODE Worm/Fly peak calling reanalysis
     [Diagram labels: ssh and ftp access; virtual machines (App/OS); working space; simple persistent storage (glusterfs); hypervisors; racks of hardware; private cloud (Eucalyptus & Cube)]
  9. Hybrid Clouds
     [Diagram labels: ami-efa24c86; Bionimbus virtual machine images; virtual machines (App/OS); hypervisors; hardware cluster; public cloud; private / community cloud]
  10. Bionimbus Delivery Mechanisms
      Log in and use the Bionimbus cloud.
      Use Bionimbus Virtual Machine Images in a) your private cloud; b) the Bionimbus cloud; c) public clouds such as Amazon.
      Bionimbus is open source, so you can build your own cloud and interoperate with ours (first release of the integrated system: 3Q 2010).
      Bionimbus data services for genomic data, even for large datasets.
  11. Elastic Clouds
      Goal: Minimize cost of virtualized machines & provide on-demand.
      Large Data Clouds
      Goal: Maximize data (with matching compute) and control cost.
      HPC
      Goal: Minimize latency and control heat.
  12. A successful cloud will…
      Web 2.0/3.0 user interface
      Compute services at the scale of a data center.
      High speed network to move & share the data
      Persist & refresh data over the long term
  13. Part 2.
  14. 501(c)(3) Not-for-profit corporation
      Develops standards, interoperability frameworks, and reference implementations.
      Operates clouds.
      Develops benchmarks.
      One area of focus: bridge between private and public clouds.
  15. Operates Clouds
      500 nodes
      3000 cores
      1.5+ PB
      Four data centers
      10 Gbps
      Target to refresh 1/3 each year.
      • Open Cloud Testbed
      • Open Science Data Cloud
      • Cloud-based Disaster Relief Services
  16. OCC Members
      Companies: Yahoo, Cisco, Aerospace Corp., Booz Allen Hamilton, InfoBlox, Open Data Group, Raytheon
      Universities: CalIT2, Johns Hopkins, Northwestern University, University of Chicago, University of Illinois at Chicago
      Government agencies: NASA
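The capacity figures on the "Operates Clouds" slide (500 nodes, 3000 cores, 1.5+ PB, 10 Gbps, refresh 1/3 each year) can be related to one another with some simple arithmetic. The sketch below is illustrative only: it assumes decimal units (1 PB = 10^15 bytes), resources spread evenly across nodes, and a link running at its full nominal rate with no protocol overhead. None of these assumptions comes from the talk.

```python
# Figures taken from the "Operates Clouds" slide.
NODES = 500
CORES = 3000
STORAGE_BYTES = 1.5e15        # 1.5 PB, assuming decimal units
LINK_BITS_PER_S = 10e9        # 10 Gbps
REFRESH_FRACTION = 1 / 3      # target share of hardware refreshed per year

SECONDS_PER_DAY = 86_400


def transfer_seconds(num_bytes: float,
                     bits_per_s: float = LINK_BITS_PER_S) -> float:
    """Ideal time to move num_bytes over the link (no overhead assumed)."""
    return num_bytes * 8 / bits_per_s


# Assumption: cores and storage are spread evenly across nodes.
cores_per_node = CORES / NODES
tb_per_node = STORAGE_BYTES / NODES / 1e12
nodes_refreshed_per_year = round(NODES * REFRESH_FRACTION)

if __name__ == "__main__":
    print(f"cores per node: {cores_per_node:.0f}")
    print(f"storage per node: {tb_per_node:.0f} TB")
    print(f"nodes refreshed per year: {nodes_refreshed_per_year}")
    print(f"1 TB over the link: {transfer_seconds(1e12):.0f} s")
    days = transfer_seconds(STORAGE_BYTES) / SECONDS_PER_DAY
    print(f"all 1.5 PB over the link: {days:.1f} days")
```

Even at the full 10 Gbps, moving the whole 1.5 PB would take about two weeks, which suggests why the talk lists a high-speed network among the requirements for a successful cloud.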
  17. Open Cloud Consortium Perspective
      Vendor neutral
      Open, interoperable architecture
      Experiment at scale
      Operate infrastructure at the scale of a small data center
      Long-term point of view (think like a library, not a cloud service provider)
      Think public, private & hybrid clouds
  18. Condo Clouds
      Raywulf rack
  19. Open Cloud Testbed
      Phase 2
      9 racks
      250+ Nodes
      1000+ Cores
      10+ Gb/s
      [Map labels: C-Wave, CENIC, Dragon, MREN]
      • Hadoop
      • Sector/Sphere
      • Thrift
      • KVM VMs
      • Eucalyptus
      • Nova
  20. Open Science Data Cloud
      Astronomical data
      Biological data (Bionimbus)
      Networking data
      Image processing for disaster relief
  21. Applications
      [Diagram labels: Apps; Compute Services; Cloud Metadata Services; Data Services; PaaS; Storage Services; Identity Manager; Virtual Machine Manager; Virtual Network Manager; IaaS; Network Transport]
  22. Standards
      Platform as a Service
      • Cloud Compute Services
      • Data/Table Cloud Services
      • Cloud Storage Services
      Large Data Cloud Interoperability Framework
      SNIA Cloud Data Management Interface (CDMI)
      Infrastructure as a Service
      • Virtual Data Centers (VDC)
      • Virtual Networks (VN)
      • Virtual Machines (VM)
      Open Cloud Computing Interface (OCCI)
      Open Virtualization Format (OVF)
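The Standards slide pairs each service layer with the frameworks that apply to it. One way to make that pairing explicit is a small lookup structure, sketched below. The grouping of standards under each layer is a reading of the flattened slide rather than something the talk states outright, and the `Layer` class and `standards_for` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layer:
    """One layer of the cloud stack as listed on the Standards slide."""
    name: str
    services: Tuple[str, ...]
    standards: Tuple[str, ...]


# Assumption: each standard belongs to the layer it follows on the slide.
STACK = (
    Layer(
        name="Platform as a Service",
        services=(
            "Cloud Compute Services",
            "Data/Table Cloud Services",
            "Cloud Storage Services",
        ),
        standards=(
            "Large Data Cloud Interoperability Framework",
            "SNIA Cloud Data Management Interface (CDMI)",
        ),
    ),
    Layer(
        name="Infrastructure as a Service",
        services=(
            "Virtual Data Centers (VDC)",
            "Virtual Networks (VN)",
            "Virtual Machines (VM)",
        ),
        standards=(
            "Open Cloud Computing Interface (OCCI)",
            "Open Virtualization Format (OVF)",
        ),
    ),
)


def standards_for(service: str) -> Tuple[str, ...]:
    """Return the standards listed for the layer that contains service."""
    for layer in STACK:
        if service in layer.services:
            return layer.standards
    return ()  # unknown service: no standards listed


if __name__ == "__main__":
    for layer in STACK:
        print(f"{layer.name}: {', '.join(layer.standards)}")
```

For example, `standards_for("Virtual Machines (VM)")` returns the OCCI and OVF entries, while a service not named on the slide yields an empty tuple.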
  23. OCC Benchmarks
      There are surprises.
  24. Acknowledgements
  25. Thank You
      For more information:
      (for research papers, etc.)